

Digitized by the Internet Archive 
in 2023 with funding from 
University of Alberta Library 


https://archive.org/details/Scissons1976


THE UNIVERSITY OF ALBERTA 


RELEASE FORM 


NAME OF AUTHOR Edward H. Scissons 

TITLE OF THESIS Convergence of Clinical Judgement: A Multitrait 
Analysis 

DEGREE FOR WHICH THESIS WAS PRESENTED Doctor of Philosophy 


YEAR THIS DEGREE GRANTED 1976 

Permission is hereby granted to THE UNIVERSITY OF ALBERTA 
LIBRARY to reproduce single copies of this thesis and to lend or sell 
such copies for private, scholarly, or scientific research purposes 
only. 

The author reserves other publication rights, and neither the
thesis nor extensive extracts from it may be printed or otherwise
reproduced without the author's written permission.


THE UNIVERSITY OF ALBERTA 


CONVERGENCE OF CLINICAL JUDGEMENT: A MULTITRAIT ANALYSIS 


by 
EDWARD H. SCISSONS 


A THESIS 
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES AND RESEARCH 
IN PARTIAL FULFILMENT OF THE REQUIREMENTS FOR THE DEGREE 


OF DOCTOR OF PHILOSOPHY 


DEPARTMENT OF EDUCATIONAL PSYCHOLOGY 


EDMONTON, ALBERTA 


FALL, 1976 




THE UNIVERSITY OF ALBERTA 


FACULTY OF GRADUATE STUDIES AND RESEARCH 


The undersigned certify that they have read, and recommend
to the Faculty of Graduate Studies and Research, for acceptance, a
thesis entitled Convergence of Clinical Judgement: A Multitrait
Analysis submitted by Edward H. Scissons in partial fulfilment of
the requirements for the degree of Doctor of Philosophy.


ABSTRACT 

This research was a study of the convergence of clinical 
judgement (multitrait ratings) across three different information 
sources (psychometric tests, interview, and test + interview). Of 
major interest was the similarity of clinical evaluations of ability 
across the three different information sources. 

Subjects (N=74) were executives appraised by a firm of industrial 
psychologists. Subjects were evaluated independently on 18 different 
traits on the basis of: test information alone, interview information 
alone, or test and interview information combined. 

Results indicate a varying degree of convergence of clinical ratings
dependent on clinician and trait. A clinician by factor by rating
condition model of executive assessment is suggested. Convergence
indices ranged from a high of .64 to a low of .05. The nature of
reliability theory, as it pertains to clinical judgement research,
is discussed and suggestions for further research in the area are
presented.




ACKNOWLEDGEMENTS 

I would like to acknowledge the contribution of many friends 
who assisted in the development of this research project. 

The clinicians and staff of A. W. Fraser & Associates (Al Fraser, 
Jim Wuest, John Roshak, Pat Mitchell, and Libby Bolstler) gave 
unconditionally of their time and support. Without them, this project 
would not have been possible. 

I am also indebted to the members of my committee: Dr. George 
Fitzsimmons (Chairman), Dr. Charles Anderson, Dr. Harvey Zingle, Dr. 
Murray Smith, and Dr. Lloyd Njaa (External). Their advice and counsel 
was appreciated. 

Most importantly, I would like to thank Linda and Paddy who were 
forced to understand the neurosis that is research. 

Financial support during the course of this research was provided
by a University of Alberta Dissertation Fellowship.


Ed Scissons 




TABLE OF CONTENTS

CHAPTER
I. STATEMENT OF THE PROBLEM
II. REVIEW OF THE LITERATURE
    Models of Clinical Judgement
    Validity and Reliability of Clinical Judgement
    Clinical Judgement in Executive Appraisal
    Summary
III. EXPERIMENTAL DESIGN
    Procedure
    Experimental Hypotheses
    Limitations of the Study
IV. [title illegible]
    Definition of Terms
    Inter-Rater Reliability: Test Condition
    Factor Analysis of Test Condition Ratings
    Summary
V. DISCUSSION
    Interpretive Problems
    Conclusions and Implications
    Suggestions for Further Research
REFERENCES
APPENDIX 1. DEFINITION OF CHARACTERISTICS
APPENDIX 2. INTERVIEW RATING FORM
APPENDIX 3. INTERVIEW + TEST RATING FORM



APPENDIX 4.
APPENDIX 5.
APPENDIX 6.
APPENDIX 7.
[title illegible]
FACTOR ANALYSIS OF 18 CHARACTERISTICS
CAPSULE SUMMARY OF TESTS


LIST OF TABLES

Table    Description
1        ANOVA: Factor 1
2        Reliability: Factor 1
3        Means and Standard Deviations: Factor 1
4        ANOVA: Factor 2
5        Reliability: Factor 2
6        Means and Standard Deviations: Factor 2
7        ANOVA: Factor 3
8        Reliability: Factor 3
9        Means and Standard Deviations: Factor 3
10       ANOVA: Factor 4
11       Reliability: Factor 4
12       Means and Standard Deviations: Factor 4
13       ANOVA: Factor 5
14       Reliability: Factor 5
15       Means and Standard Deviations: Factor 5
16       ANOVA: Factor 6
17       Reliability: Factor 6
18       Means and Standard Deviations: Factor 6
19       ANOVA: Factor 7
20       Reliability: Factor 7
21       Means and Standard Deviations: Factor 7
22       ANOVA: Factor 8
23       Reliability: Factor 8
24       Means and Standard Deviations: Factor 8
25       ANOVA: Factor 9
26       Reliability: Factor 9
27       Means and Standard Deviations: Factor 9
28       ANOVA: Factor 10
29       Reliability: Factor 10
30       Means and Standard Deviations: Factor 10
31       ANOVA: Factor 11
32       Reliability: Factor 11
33       Means and Standard Deviations: Factor 11
34       ANOVA: Factor 12
35       Reliability: Factor 12
36       Means and Standard Deviations: Factor 12
37       ANOVA: Factor 13
38       Reliability: Factor 13
39       Means and Standard Deviations: Factor 13
40       ANOVA: Factor 14
41       Reliability: Factor 14
42       Means and Standard Deviations: Factor 14
43       ANOVA: Factor 15
44       Reliability: Factor 15
45       Means and Standard Deviations: Factor 15
46       ANOVA: Factor 16
47       Reliability: Factor 16
48       Means and Standard Deviations: Factor 16
49       ANOVA: Factor 17
50       Reliability: Factor 17
51       Means and Standard Deviations: Factor 17
52       ANOVA: Factor 18
53       Reliability: Factor 18
54       Means and Standard Deviations: Factor 18
55       Inter-Rater Reliability Estimates




CHAPTER I 
STATEMENT OF THE PROBLEM 

"Progress in psychological assessment is important not only to
such applied fields as clinical, counselling, educational, and
industrial psychology, but is vital also to the continued development
of psychology as-a-whole" (McReynolds, 1968, p. 1). Modern
psychology is directly concerned with understanding the human
condition; research in psychology has traditionally been oriented
towards the development and evaluation of better assessment techniques
and procedures (McReynolds, 1968).

The "clinical judgement debate", as it has been called,
developed as a vigorous movement in psychology in the early 1950's.
Of concern were problems such as the ability of psychologists to
predict future behavior, the validity and reliability of prediction,
and clinical versus actuarial methods of prediction. Early
writings, such as that of Meehl (1954), did much to spark debate
between the psychodiagnosticians on one hand and the actuarially
oriented researcher or clinician on the other. Of most importance
was the accuracy (in all senses of the word) of assessment decisions
based on either clinical or actuarial integration of client
information. The controversy is far from over, but research of late
has concentrated more on improving both methods of prediction or
decision making rather than fanning the fires of difference that
exist between the two (Goldberg, 1970).

Managers, administrators, and other executives play an important
role in modern society and are always in short supply
(Dunnette, 1971). This is not to imply that managers and other
professionals are in short supply but rather that good managers,
good administrators, and good executives remain a scarce commodity
in the occupational marketplace.

Industrial psychologists and other professionals concerned
with what makes a good executive, and with how to identify a good
executive by methods other than trial and error, are involved in a
specialized aspect of the clinical judgement dilemma. Research
here has focused on studies concerned with the predictive validity
of executive assessments and studies which investigated the
assessment process itself from the points of view of validity and
reliability. Thus we have studies such as that by Bray & Grant (1966)
that investigate the specific contribution of the interview to
over-all executive assessment and studies such as that by Wollowick
& McNamara (1969) which look at the components of an executive
assessment program. Other research, not specifically dealing with
clinical judgement, has been concerned with the interview as a
diagnostic technique (Webster, 1964; Grant & Bray, 1969; Ulrich &
Trumbo, 1965; Mayfield, 1964), testing as an adjunctive or sole
means of executive assessment (Henrichs, 1969; Spitzer & McNamara,
1964; Bray & Moses, 1972), or various multiple assessment techniques
(Albrecht, Glaser & Marks, 1964; Wollowick & McNamara, 1969;
Campbell, Otis, Liske & Prien, 1962).

The relationship of clinical judgement to executive appraisal
is a logical one. Clinical judgement is concerned with assessment;
those dealing with executive appraisal are also concerned with
assessment at a very operational level. Although most research in
the area of clinical judgement has been concerned with unidimensional
decision making, e.g., the diagnosis of psychotic versus
neurotic from MMPI profiles (Goldberg, 1965), some researchers
(Goldberg & Werts, 1966; Donaldson, 1969) have addressed themselves
to a more complex multitrait multimethod approach (Campbell &
Fiske, 1959). There is, however, little research which relates
these clinical judgement findings from clinical psychology to the
multitrait multimethod domain of executive appraisal. There has
been virtually no work, other than that concerned with assessment
centers, which relates this multitrait multimethod model to
executive appraisal in a natural setting. It is the use of this
natural setting which is most likely to result in research findings
high in generalizability as a result of high ecological (external)
validity (Snow, 1974).

Even within the domain of clinical judgement in clinical
psychology, most research has focused on prediction accuracy,
stability, or consensus rather than convergence as measures of
judgement effectiveness. Convergence in clinical judgement is
important because it yields a measure of the degree of similarity
in the assessments a clinician makes with respect to his clients
as a result of the different types of data available about these
clients, e.g., test versus nontest data (Goldberg & Werts, 1966).



This study is an investigation of the convergence of clinical 
judgement in executive appraisal. The hypothesis tested is that 
there will be a significant difference in the assessment of a client 
by a clinician depending on the type of information available 
about that client. Of specific interest in this study are the 
differences in appraisal (multitrait ratings) as a result of 
information obtained by (a) interview alone, (b) testing alone, or 
(c) testing + interview combined. 

This study is of considerable importance at both a theoretical
and an operational level. At a theoretical level, the rationale
for the study focuses on the generalizability of clinical findings
across data bases; the nature of the interaction between trait, information
base, and clinical judgement, particularly as these affect multitrait
analysis of ability; and the provision of an empirical base
for further predictive validity studies once the problem of
convergence has been accounted for. At present, there exists no
research to provide a rationale for the generalizing of clinical
judgement findings across data bases; there appears to be an unmet
assumption of high convergence.

At an operational level, this study is important because it 
is concerned with the possible duplication of psychological services. 
If high convergence is evident on several or all of the traits 
involved in this multitrait analysis, cost alone should dictate a 
judicious duplication of services through multimethod assessment
techniques.




CHAPTER II 
REVIEW OF THE LITERATURE 

"To many people, the prediction problem must seem to be the
basic problem of applied psychology" (Gough, 1962, p. 526). Studies
of clinical judgement, which are only one aspect of the 'prediction
problem' discussed at length by Gough (1962), have progressed
through a number of rather distinct stages if viewed in a historical
perspective (Bieri, Atkins, Briar, Leaman, Miller, & Tripodi, 1966).
Research has developed from its roots in introspective analysis
(Erickson, 1959) to studies of the validity and reliability of
clinical judgement, clinical versus statistical prediction
(Meehl, 1954), and on to the most recent stage which is concerned
with models of decision making within the framework of decision
theory. In many ways, studies concerned with the validity/reliability
of clinical judgement and those concerned with actuarial
versus clinical predictive validity are similar. Both are concerned
with improving and/or describing the decision making process
directly, i.e., in terms of outcomes. The last stage, model building,
has been an attempt to develop theoretical models of decision
making or information processing as an indirect attempt to improve
future decisions (Bieri et al., 1966) rather than to a priori
evaluate present ones.

This literature review will examine clinical judgement from
three perspectives: (1) models of clinical judgement, (2) reliability
and validity of clinical judgement including the actuarial versus
clinical dilemma, and (3) clinical judgement in executive appraisal.

Models of Clinical Judgement

Since the early 1960's the focus of clinical judgement
research has been the nature of the clinical judgement
decision making process itself. Of major concern has been the
development of mathematical models to either explain or improve on
the actual judgement of the clinician.

Goldberg (1971) isolates two general models for clinical
decision making: linear and non-linear. The linear model is that
model expressed by a multiple regression analysis and is equivalent
to the formulation of regression weights in order to combine
available information accurately for purposes of prediction.
Non-linear models usually involve some type of moderator variable
effect; i.e., the weighting of one variable will vary in relation
to the magnitude of the difference between two or more other
variables. A number of different types of non-linear models have
been postulated (Goldberg, 1971; Einhorn, 1970, 1971), all involving
some form of moderator variable combination.
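
The contrast between the two model families can be made concrete with a short sketch. It is illustrative only: the function names, the three-cue layout, and the particular form of the moderator term are assumptions made for this example and are not taken from Goldberg (1971) or Einhorn (1970, 1971).

```python
def linear_judgement(cues, weights, intercept=0.0):
    """Linear model: the judgement is a fixed weighted sum of the cues."""
    return intercept + sum(w * x for w, x in zip(weights, cues))


def moderated_judgement(cues, base_weight, moderation):
    """Hypothetical moderator-variable model with exactly three cues.

    The weight applied to the first cue is not fixed: it shifts with the
    difference between the second and third cues, which is one simple way
    of expressing a moderator effect.
    """
    x0, x1, x2 = cues
    effective_weight = base_weight + moderation * (x1 - x2)
    return effective_weight * x0
```

When the second and third cues are equal, the moderated model reduces to a one-cue linear model; as they diverge, the same first cue contributes more or less to the judgement.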

Wiggins & Hoffman (1968) outline an important study which
examines the relative efficacy of three different models of
information combination: the linear, quadratic, and sign models. The
quadratic model is similar to the linear model already described
but includes the squares and cross-products of the predictors in the
original linear model.
The sign model incorporates a linear combination of 70 clinical
signs in relation to MMPI interpretation first described by
Goldberg (1965). Their experiment involved an experimental design
now classic in clinical judgement research. Psychologists were
required to rate MMPI profiles as psychotic or neurotic in a blind
rating fashion. Results indicated the presence of both linear and
configural processing of information by clinicians dependent on
both clinician and subject samples. Clinicians obtained results
which were similar to computer integration of information as per
the three models just described. However, as noted by the authors,
the differences between the results obtained by any of the three
methods of information combination were not great. The simple
linear model combined data in a very efficacious manner.
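
As a rough illustration of how the quadratic model extends the linear one, the sketch below builds the predictor terms each model would weight. The function names are hypothetical, and the sign model is omitted because its 70 clinical signs are not listed here.

```python
from itertools import combinations


def linear_terms(cues):
    """Terms of the linear model: the cues themselves."""
    return list(cues)


def quadratic_terms(cues):
    """Terms of the quadratic model: cues, their squares, and all pairwise products."""
    cues = list(cues)
    squares = [x * x for x in cues]
    products = [a * b for a, b in combinations(cues, 2)]
    return cues + squares + products


def weighted_sum(terms, weights):
    """Combine terms with one weight per term (an intercept could be added separately)."""
    if len(terms) != len(weights):
        raise ValueError("one weight is needed for every term")
    return sum(w * t for w, t in zip(weights, terms))
```

With k cues the quadratic model weights 2k + k(k - 1)/2 terms rather than k, so it needs considerably more cases to estimate its weights stably. That cost is worth bearing in mind given how small the reported differences between models were.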

Goldberg's (1965) study is further supported by Dawes (1972)
and Dawes & Corrigan (1974) who describe two different types of
linear models in an experiment designed to test the ability of
human judges to perform against even random linear models. The
two models, actuarial (based on a regression of the criterion on
the predictors) and bootstrapping (based on a regression of the
judges' prediction on the predictors), were both superior to the
decisions of human judges even when regression weights were assigned
randomly rather than systematically. Experiments cited by the
two authors involved the rating of psychotic versus neurotic on the
MMPI, prediction of graduate school success, and geometric design
estimation. Dawes (1972) summarizes his findings: "If a reasonable
sample of cases exists for which the output values are known, the
best way to make the predictions is to estimate beta weights for
the input variables on the basis of multiple regression; human
judges should be ignored" (p. 3).
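
The distinction between the actuarial and bootstrapping models lies only in what is regressed on the predictors. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the small least-squares routine is a generic textbook solver, not the procedure used by Dawes (1972) or Dawes & Corrigan (1974).

```python
def fit_linear_weights(cue_rows, targets):
    """Ordinary least squares via the normal equations; returns [intercept, w1, w2, ...]."""
    rows = [[1.0] + [float(x) for x in row] for row in cue_rows]
    k = len(rows[0])
    # Augmented normal-equation matrix [X'X | X'y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("cues are collinear; weights are not unique")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


def actuarial_model(cue_rows, criterion):
    """Actuarial model: regress the known criterion on the predictors."""
    return fit_linear_weights(cue_rows, criterion)


def bootstrapped_model(cue_rows, judge_ratings):
    """Bootstrapping model: regress the judge's own ratings on the predictors."""
    return fit_linear_weights(cue_rows, judge_ratings)


def predict(weights, cue_row):
    """Apply fitted weights (intercept first) to one case."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], cue_row))
```

Either fitted model is then used in place of the judge through `predict`. The bootstrapped model needs no criterion data at all, since it only captures the judge's policy with the inconsistency of case-by-case judgement removed.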

Wainer (1976) further reinforces this finding. He indicates 
that, in very general circumstances, little is lost in terms of the 
original data if regression coefficients are estimated rather than 
calculated. 
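
One way to picture "estimated rather than calculated" coefficients is the equal-weights composite. The sketch below assumes that the estimated coefficients are simply unit weights applied to standardized predictors; this reading, the function names, and the choice of population standard deviation are assumptions for illustration and are not details given in the text above.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw scores to z-scores using the population standard deviation."""
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("a constant predictor cannot be standardized")
    return [(v - m) / s for v in values]


def unit_weight_composite(cue_columns):
    """Sum the standardized predictors, giving every predictor the same weight.

    cue_columns holds one list of scores per predictor, all of equal length.
    """
    z_columns = [standardize(col) for col in cue_columns]
    return [sum(zs) for zs in zip(*z_columns)]
```

No criterion data are needed to form this composite, which is why it is attractive when too few cases are available to calculate stable regression weights. Predictors are assumed to be scored so that higher values point in the same direction.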

Configural processing, best described as a usage of moderator
variables either overtly or covertly, has also commanded considerable
attention in the clinical judgement literature. Hoffman, Slovic, &
Rorer (1968) utilized an ANOVA technique to assess configural
processing in the diagnosis of malignant gastric ulcers using nine
radiologists as clinicians. Although the authors were able to
demonstrate conclusively the reality of configural processing, they
further indicate that even when this processing was utilized by the
clinicians, clinician decision accuracy did not match even that of a
simple linear combinative model.

Einhorn (1972), in an important study involving the clinical
judgement of malignant cancers, addresses himself to the efficacy of
combining components of the decision making process rather than the
binary decisions involved in many of the classic studies in the
area. He suggests using expert clinicians, each in a specific
area of expertise, and combining their 'mini-decisions' mechanically.

Shinedling, Howell, & Carlson (1975) combine clinical
'rule of thumb' techniques with statistics to produce a 'clinistics'
model of clinical judgement. They conclude that, "rather than
trying to justify the utility of personal, private judgement, 
psychologists should study the contribution of objective clinical 
decision-making strategies. Studying 'clinistics' might lead to 
new insights and understandings about behavior (p. 389)". 

Goldberg (1970) may be getting much closer to the truth when 
he describes his very important study which once again utilizes 
the clinical task of distinguishing psychotic versus neurotic MMPI 
profiles. He concludes that the model that the clinicians actually 
used, when applied systematically and consistently, yielded better 
decisions than did the actual clinicians. The problem with clinicians,
he argues, may not be that they are wrong, but that they are
inconsistent (or human!).
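The logic of modelling the judge can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of Goldberg's analysis: the two cues, the judge's policy weights, the noise levels, and the sample size are all hypothetical, and the judge is assumed to apply a valid linear policy inconsistently.

```python
import random

def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def fit_two_cue_model(c1, c2, y):
    """Least-squares regression of y on two cues, solved from the normal equations."""
    n = len(y)
    m1, m2, my = sum(c1) / n, sum(c2) / n, sum(y) / n
    s11 = sum((a - m1) ** 2 for a in c1)
    s22 = sum((b - m2) ** 2 for b in c2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(c1, c2))
    s1y = sum((a - m1) * (v - my) for a, v in zip(c1, y))
    s2y = sum((b - m2) * (v - my) for b, v in zip(c2, y))
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_cases = 400           # hypothetical number of profiles

cue1 = [rng.gauss(0, 1) for _ in range(n_cases)]
cue2 = [rng.gauss(0, 1) for _ in range(n_cases)]

# Criterion: depends on the cues plus unpredictable variation (assumed weights).
criterion = [0.6 * a + 0.4 * b + rng.gauss(0, 1) for a, b in zip(cue1, cue2)]
# Judge: uses the same policy but applies it inconsistently (independent noise).
judge = [0.6 * a + 0.4 * b + rng.gauss(0, 1) for a, b in zip(cue1, cue2)]

# Model of the judge: regress the judge's own ratings on the cues.
b0, b1, b2 = fit_two_cue_model(cue1, cue2, judge)
model = [b0 + b1 * a + b2 * b for a, b in zip(cue1, cue2)]

r_judge = pearson(judge, criterion)  # validity of the judge
r_model = pearson(model, criterion)  # validity of the model of the judge
print(round(r_judge, 2), round(r_model, 2))
```

Because the fitted model keeps the judge's policy and discards the trial-to-trial inconsistency, its predictions track the criterion more closely than the judge's own ratings do in this simulation.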

Slovic, Rorer, & Hoffman (1971) carry Goldberg's (1971) 
research one step further. They investigated the reasons why 
clinicians diagnose differentially. In a study involving the 
diagnosis of gastric ulcer malignancy, they attempted to discover 
how each clinician used the various clinical signs available to him.
Their research enables them to trace differential diagnosis back 
to a differential use of clinical signs. They suggest that the major
use of their method lies in the opportunity it affords for the
'train-to-model' teaching of student clinicians.

The Validity and Reliability of Clinical Judgement 

Outcome studies in the area of clinical judgement have focused
most directly on the predictive validity of clinical decision making,
be those decisions made clinically or actuarially.




Meehl (1954), in his now classic book, Clinical versus 
Statistical Prediction, analyzed previously published studies dealing 
with the validity and reliability (consistency and stability) of 
clinical and actuarial decisions. He summarizes his findings: 

In spite of the defects and ambiguities present, let
me emphasize the brute fact that we have here,
depending upon one's standards for admission as
relevant, from 16 to 20 studies involving a comparison
of clinical and actuarial methods, in all
but one of which the predictions made actuarially
were either approximately equal or superior to
those made by the clinician. (p. 119)

Although attempting to maintain a balanced perspective in 
analyzing the clinical versus actuarial dilemma, Meehl (1954) finds 
himself unavoidably drawn to the side of the actuary. The clinician 
cannot predict at a level that would rival even the most simple 
linear regression equation. Meehl (1954) has been taken to task by 
several other writers because of his handling of the clinical versus 
actuarial problem. 

Holt (1958) rejects as artificial the dichotomy employed by 
Meehl (1954) of clinicians on one hand and actuaries on the other. 
He indicates that clinical judgement must enter the actuarial 
process at frequent intervals. The actuary must still select his 
tests, criterion measures, intervening variables, and psychological 
constructs. How then, Holt (1958) argues, can we even talk of such
a false distinction? Both are merely forms of clinical integration.


In a later treatise, Holt (1970) reaffirms his argument while
concluding that the largely actuarial model does have some place in
combining largely numerical information for purposes of decision 
making. 

Sawyer (1966) also sees the problem of clinical versus 
actuarial decision making as merely the last half of the problem. 
He indicates that the collection of data can also be considered as 
a clinical or actuarial problem (e.g., the choice to collect test 
or interview data). Sawyer (1966) concludes that the real strength 
of the clinician is in the providing of additional nonpsychometric 
information to the decision making process and not in decision making 
per se. Sawyer (1966) indirectly discounts much of the research 
reviewed by Meehl (1954) by indicating that the paucity of research 
favoring the clinical method derives from the fact that the research 
design utilized in many studies has forced the clinician to play 
the actuarial game (e.g., forced choice responses for ease of 
tabulation or the exclusion of nonpsychometric information such as
interview impressions). Holt (1970) reaffirms this view; he says
that studies have yet to look at clinical prediction at its best 
compared with actuarial prediction at its best. 

Meehl (1954) describes four combinations of data and methods of 
obtaining data as (a) psychometric data combined mechanically, 
(b) psychometric data combined nonmechanically, (c) nonpsychometric 
data combined mechanically, or (d) nonpsychometric data combined 
nonmechanically. More complex combinations of these singular 
combinations are also possible (e.g., psychometric and nonpsychometric
data combined nonmechanically). However, the bulk of research that
Meehl (1954) reviews would fall into categories (a) and (b); 
little evidence is available regarding the more methodologically 
difficult categories or combinations of categories. It seems that 
Meehl (1954) is reviewing studies high in experimental rigor but low 
in ecological validity. 

Holt (1970), in his review of Meehl (1954), Holt (1958),
Sawyer (1966), and more recent clinical and actuarial findings,
concludes that:


(a) When the necessary conditions for setting up a
pure actuarial system exist, the odds are heavy that
it can out-perform clinicians' judgements in predicting
almost anything in the long run if both sides have
access only to quantitative data such as an MMPI
profile. (b) A complete six-step predictive system
is almost always better than a more primitive one, and
even when it seems to be entirely statistical, it
requires the exercise of a great deal of subjective
judgement to work efficiently. (c) Disciplined,
analytical judgement is generally better than global,
diffuse judgement, but it is not any less clinical.
(d) To predict almost any kind of behavior or behavioral
outcome, one does better to assess the situation in
which the behavior occurs in addition to assessing the
actors' personalities. (e) Granted such knowledge
and a meaningful criterion to predict, clinical
psychologists vary considerably in their ability to
do the job, but the best of them can do very well.
That is they do have the skills in assessing personality
by largely subjective, but partly objectifiable
procedures, making use of theories that permit
a deeper and more valid understanding of persons than
anything a statistician can provide. (p. 348)


The real problem of the predictive validity of clinical or 
actuarial judgement may be escaping both clinician and actuary. Ash 
& Kroeker (1975) review the efficacy of both models of decision making. 
They would rate both as low, indicating that a criterion-predictor
match of .60 (high by today's standards for either clinical or
actuarial techniques) is still appallingly low.

Clinical Judgement: Reliability

In comparison to both model building and predictive validity 
studies, a much more limited amount of research has focused on the 
problems associated with the reliability of clinical judgements. 
Goldberg & Werts (1966) cite several types of reliability measures 
of interest to clinical judgement researchers: "(a) over time for 
the same judges using the same data (stability), (b) over judges, 
for the same data from the same occasion (consensus), and (c) over
data sources administered on the same occasion and interpreted by the
same judge (convergence) (p. 199)". Goldberg & Werts (1966)
indicate that problems in any one of these areas or, as is more 
likely, in combinations of these areas, pose threats to the validity 
of judgement. They see the error covariation across time, sources, 
traits, and targets as major limitations in the study of clinical 
judgement. They indicate that, "no study of the reliability of 
clinical inferences is ever likely to provide definitive conclusions 
(p. 200)". Sawyer (1966), in discussing the overriding concern 
with the validity of clinical judgements, comments that simple com- 
parisons between combinative models do little to explain or improve 
either method. 
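The three reliability measures distinguished by Goldberg & Werts (1966) can be made concrete with a small sketch. The ratings below are invented for illustration only (five subjects rated on a single trait); the judge labels and the choice of the Pearson correlation as the index of agreement are assumptions.

```python
def pearson(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical ratings of five subjects, keyed by (judge, data source, occasion).
ratings = {
    ("judge_A", "MMPI", 1): [3, 5, 2, 4, 1],
    ("judge_A", "MMPI", 2): [4, 6, 3, 5, 2],       # same judge, same data, later occasion
    ("judge_B", "MMPI", 1): [2, 5, 3, 4, 1],       # different judge, same data and occasion
    ("judge_A", "Rorschach", 1): [1, 4, 2, 5, 3],  # same judge, different data source
}

base = ratings[("judge_A", "MMPI", 1)]

# (a) Stability: same judge, same data, across occasions.
stability = pearson(base, ratings[("judge_A", "MMPI", 2)])
# (b) Consensus: different judges, same data, same occasion.
consensus = pearson(base, ratings[("judge_B", "MMPI", 1)])
# (c) Convergence: same judge, same occasion, different data sources.
convergence = pearson(base, ratings[("judge_A", "Rorschach", 1)])

print(stability, consensus, convergence)
```

Each index holds two of the three facets (judge, data source, occasion) constant and varies the third, which is why a design that varies two facets at once cannot separate the corresponding sources of unreliability.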

The classic study of convergence in clinical judgement was 
done by Little & Schneidman (1959). These researchers were concerned 
with the convergence of clinical judgement over certain aspects
of a similar data base (psychometric data). Clinicians were required
to rate subjects using a Q-sort technique as either psychotic, 
neurotic, psychosomatic or normal on the basis of one of the 
Rorschach, Thematic Apperception Test and Make a Person, MMPI or
a combination of several interpretive tests. Their findings, while
disheartening for the clinician, are not altogether unexpected. 
They were unable to find a high degree of convergence across similar 
aspects of the same data base. The problems in generalizing from 
the Little and Schneidman (1959) study are manifold. They are 
dealing with a unidimensional data base (psychometric data), are 
concerned with unidimensional decision making, and are concerned 
with a psychologically "unwell" population. 

Goldberg & Werts (1966) utilize a specialized form of
multitrait multimethod clinical judgement research. Clinical
psychologists were required to rate psychiatric patients on four
categories using one of four data sources (MMPI, Rorschach, Wechsler, 
or Vocational History). They were unable to find any relationship 
between the judgements of one clinician working from one information 
source and those of another clinician working from another data 
source. This study cannot be considered a real study of convergence 
in clinical judgement since it is concerned more with agreement
across raters (consensus) than it is with agreement across sources
(convergence). This study would probably score low in what Snow
(1974) would call ecological or external validity. There seems to 
be a real dissimilarity between experimental tasks and "real"
clinician tasks in real assessment situations. Experimental
clinicians were asked to rate subjects in a manner which was
probably foreign to them and were then chastised for failing to rate
consistently. Sawyer (1966) would see this as a study in which
the clinician was made to play the actuarial game. This threat
to external validity is further magnified by the confounding of
consensus and convergence as reliability measures. How important
is it that the ratings of one clinician from one data source agree
with those of another clinician using a different data source?

Goldberg (1966), in a study of Peace Corps selection board
procedures, evaluated the stability and convergence (inseparably) of
board members' decisions regarding potential applicants. The
relationship of board members' individual decisions before and after
board discussions of the candidates was analyzed. His findings were
that decisions before and after board discussion were highly
correlated, being in the order of .80, but that correlations between
raters were only moderate, being in the order of .40. The study,
although interesting, is difficult to interpret because of the
confounding of stability and convergence. In terms of its external
validity, however, it must be applauded.

Slovic (1966) indirectly addresses himself to the reliability
of clinical judgement, particularly across diverse and multiple 
information sources. His findings indicate that, in the prediction 
of intelligence, clinicians used only two or three key predictors 
even when they were presented with (and believed they used) many.
Additional sources of information were used only when conflicting
information was evident in the prime two or three factors. A threat
to reliability, then, may be the targeting behavior of clinicians
in reference to the information they have available. This is further
confirmed by Perez (1973) in a study involving the discrimination 
between different types of criminal test protocols. His research 
indicates that additional information has little effect on decision 
accuracy (reliability or validity). 

The questions of why and how clinical judgements are unreliable 
(or reliable) remain largely unanswered in the literature. It is 
noteworthy that few researchers or studies to date have systematically 
investigated the problems of reliability, particularly convergence, 
preferring to further reinforce the wealth of information available 
in the areas of predictive validity. 

Clinical Judgement in Executive Appraisal 

Although the relationship of clinical judgement research 
to the field of executive appraisal is a logical one, the area has 
been only sparsely researched. Historically, the emphasis taken
in the dearth of information available that deals with executive
appraisal and characteristics of successful executives has focused
on the predictive validity of unimethod (interview) or multimethod
(assessment centers) assessment techniques. Thus, we have seen
very little of model building, as has been the emphasis in clinical
judgement in the areas of clinical psychology, or of research on
reliability, a point in common between the two areas.




Ulrich & Trumbo (1965) present an excellent and detailed 
summary of the personnel selection interview to which the reader is 
referred. Their findings indicate that the low predictive validity 
demonstrated by most assessment interviews may be due to contamina- 
tion of data or criterion problems. They see the major use of the 
interview in assessing personal relations and career satisfaction. 
The lack of sufficient controls on interview research is a concern 
echoed by Mayfield (1964) and Mayfield & Carlson (1972). All of 
these researchers agree that the major thrust in interviewing
research should be internal, i.e., "studying the decision making
process as it operates in the selection interview" (Mayfield &
Carlson, 1972, p. 41).

Other studies on the interview have shown low stability of
ratings (Vaughn & Reynolds, 1951) and low inter-rater reliability
(Schwab & Heneman, 1969) on the basis of informal unstructured
interviews. Vaughn & Reynolds (1951) indicate that inter-rater
reliability (consensus) increases as a direct function of interview
structure. Hollman (1972) explains part of the problem regarding
intra- or inter-rater reliability in interviewing, particularly
with respect to threats to validity. He indicates that interviewers
appear unduly swayed by negative client information obtained during
interview and tend to ignore more relevant positive information
obtained at the same time. Langdale & Wertz (1973) add that inter-rater
reliability increases as a function of interviewer knowledge
of the prospective job, adding that unless the interviewer knows
the prospective position thoroughly, inconsistencies are inevitable. 

Other researchers, in discussing threats to the reliability 
of the assessment interview (and indirectly, validity), have focused 
on other areas. Baskett (1973) indicates that a major concern 
should be the similarity of interviewer-interviewee attitudes. When 
these attitudes differ markedly, interviewee ratings suffer.
Lipsett (1964) argues for the use of interviewing, saying that much
of what we think we have with personnel tests (validity) never
really existed.

The literature on interviewing in executive appraisal, while 
plentiful, does not answer much in relation to clinical judgement. 
We know only that poor or ineffectual decisions are being made. We 
have little indication of why or where. 

The only area of executive appraisal to which a modified 
form of the clinical judgement research may be applicable is the 
assessment center. The assessment center, first commercially used 
by the American Telephone and Telegraph Company to assess managerial 
performance and potential (Bray & Grant, 1966), is an adaptation 
of German psychologists' procedures for screening officer candidates
(Dunnette, 1971; Blumenfeld, 1971). The assessment center combines 
performance appraisal techniques, such as the interview, paper- 
and-pencil tests, in-basket exercises, leaderless groups, and 
simulation exercises to formulate multitrait ratings of candidates. 
Traditional clinical judgement findings are not directly applicable 
here since ratings of several psychologists, managers, or supervisors, 
although derived independently, are combined for purposes 
of final assessment. Dunnette (1971) describes the relationship of 
assessment center findings to behavioral ratings obtained on the 
job. Correlations ranged from the low +.20's to the high +.70's 
depending on the trait measured (see Appendix 5). 

Bray & Grant (1966) studied an assessment center initiated to 
appraise future managers for the Bell Telephone System. Their 
findings indicated that, although all predictors were used for 
making ratings, considerable inter-rater variability was evident 
in combining the data. In an aspect of the same study, Grant & Bray 
(1969) dealt more specifically with the interview information 
obtained in the assessment center. Their findings indicate that 
structured interviews are able to yield reliable and valid indicators 
of future performance. 

Wollowick & McNamara (1969), in research on the use of the 
assessment center with IBM managers, found that adding 
information received from situational tests increased predictability. 
These researchers also add weight to the actuarial versus clinical 
debate by reporting that a statistical combination of the assessment 
center program variables was better than any single subjectively 
derived overall rating. Henrichs (1969), in dealing with the same 
subject pool as Wollowick & McNamara (1969), indicates that a careful 
analysis of employee work records was also highly related to future 
performance. 


Moses (1973), in a more recent study of assessment centers, 
reinforces this; he notes increased validity of assessment center 
predictions as a function of increasing time between prediction 
and evaluation. 

Albrecht, Glaser, and Marks (1964) use a multiple assessment 
procedure that is really a forerunner of the assessment center 
approach. They were unable to find significant validity in the 
procedure using a multitrait-multimethod matrix approach, but their 
research was hampered by methodological shortcomings. Criterion 
behaviors were evaluated by superiors who had little contact with 
candidates or by peers rather than by direct supervisors. 

Bray & Grant (1966) indicate that many of the key character- 
istics measured by the assessment center can be obtained by an 
interview, a finding suggested by Glaser, Schwarz, and Flanagan (1958), 
but one that is at variance with more disheartening research on the 
assessment interview (Webster, 1964). 

Blumenfeld (1971) sees the greatest benefit in assessment 
center methodology as the equal opportunity afforded candidates, 
use of trained assessors, and situational exercises high in what 
Snow (1974) would call ecological validity. Wilson & Tatge (1973) 
are less optimistic; they see the assessment center approach as 
very costly and not necessarily better than more traditional methods 
of assessment. 

Trankell (1959) describes a study which, although it deals 
almost exclusively with predictive validity, is noteworthy in terms 
of the present research. In one of the few studies that used 
psychologists exclusively as part of an industrial selection 
procedure (air pilots), candidates were rated on a 14 variable matrix 
on the basis of a clinical integration of paper-and-pencil tests. 
In what he describes as a "craftsman's job" (p. 174), Trankell (1959) 
describes how the integration of tests by a competent psychologist 
yields excellent results in terms of decision accuracy. He argues 
for the intelligent use of tests as predictors indicating that, 
rather than arguing relative merits, the strengths of each should 
be combined. 

Summary: Literature Review 

1. The general area of clinical judgement has been well 
researched specifically from the perspectives of predictive validity 
and model building. The area of clinical judgement in executive 
appraisal is only sparsely researched and the nature of that research 
has been primarily predictive validity studies of interviewing and 
assessment centers. 

2. Clinical judgements, although they may be configural in 
nature, are adequately described by a linear model. 

3. The linear model, whether it be used in a bootstrapping 
or traditional predictive manner, is at least the most accurate 
method of combining mathematically represented information for 
decision making. Even when beta weights are estimated or applied 
randomly, they better or equal a human judge working with the same 
information. 

4. There is little research on the reliability of clinical 
judgement. This is particularly true of convergence. The reliability 
studies that have been done have been concerned with consensus and/or 
stability. Convergence studies, when they have been attempted, have 
dealt with a similar data base (test or interview) or have been 
confounded with stability and/or consensus. 

5. The majority of the research on clinical judgement, 
particularly that dealing with model building and predictive validity, 
would rate low in ecological (external) validity (Snow, 1974). 

If one views generalizability as a function of representativeness 
(Snow, 1974), the majority of the studies cited have been well off 
target. Typically, clinicians are required to rate subjects on 
variables that are foreign to them, using criteria and rating 
scales totally alien to their usual method, and are then critiqued 
for off-target behavior. 

6. There exists at present no study which investigates the 
convergence of clinical judgement in a natural setting. This is 
particularly true of a natural, applied, vocational setting. 
Reliability is an extremely important, albeit ignored, concept in 
clinical judgement research (Goldberg & Werts, 1966). It should be 
noted that validity is unknown if the problems of reliability have 
not been accounted for. At present, the apple cart appears to 
have usurped the horse! 

CHAPTER III 
EXPERIMENTAL DESIGN 

Clinician Sample 

The three clinicians involved in this study are all profes- 
sional staff of A. W. Fraser & Associates, a medium-sized, locally- 
owned industrial psychology and management consulting firm. 
Clinician #1, the chief psychologist, holds psychologist registra- 
tion in three Canadian provinces, has over 12 years' experience in 
executive appraisal and many more years of clinical experience. 
Clinician #2 has a B. A. (Hon) degree in psychology and over five 
years' experience in executive appraisal. He was originally trained 
in executive appraisal techniques by Clinician #1 and was supervised 
very closely for the first three years in what might be described 
as an intensive and very highly supervised clinical-industrial 
internship. Clinician #3 is also a registered psychologist and has 
three years' experience in industrial and executive appraisal. His 
most recent two years of experience have been obtained as a staff 
member of A. W. Fraser & Associates. 
Subject Sample 

Subjects utilized consist of recruitment and comprehensive 
appraisal candidates processed by the clinicians of A. W. Fraser & 
Associates from a time beginning with the inauguration of this 
study and ending when each clinician has rated at least twenty 
candidates. This covers the period March 1975-December 1975. 


Recruitment candidates are those candidates who have applied for 
executive positions through the recruiting division of A. W. Fraser & 
Associates; comprehensive appraisal candidates are subjects sent 
to A. W. Fraser & Associates for assessment by their own companies 
in order to assess future development potential within that company. 
Procedure 

Definition of Traits 

The definitions of the traits or characteristics of concern to the 
three clinicians of A. W. Fraser & Associates in assessing executive 
talent were arrived at by a process of consensus among the three 
clinicians involved. Consensus was obtained on the number and name 
of the characteristics that 'make the difference' in executive 
performance and on the definition of these characteristics 
(Appendix 1). The three rating scales (Appendices 2, 3, & 4) 
used to quantify these characteristics had been in informal use 
in the organization previously but were modified to encompass the 
18 key characteristics arrived at by consensus and the three 
information sources (test, interview, & test-interview). 
Experimental Procedures 

1. After completion of each assessment interview, the 
clinician completed the Interview Rating Form (Appendix 2) for 
the individual interviewed. This completed rating form was imme- 
diately returned to the office secretary for safekeeping and was 
not further available to the clinician. 

2. The subject was administered the following tests as part 
of the appraisal battery: Differential Aptitude Test (Verbal and 
Abstract); Wonderlic Personnel Test; Watson-Glaser Critical 
Thinking Appraisal; Test of Business Judgement; Test of Practical 
Judgement; Supervisory Practices Test; Management Aptitude Inventory; 
Holland Vocational Preference Inventory; Edwards Personal Preference 
Schedule; and the California Psychological Inventory (see Appendix 7 
for summary description of tests). These tests comprise the usual 
executive assessment test battery utilized by the staff of A. W. Fraser 
& Associates; infrequently, additional tests are added to this battery. 


3. The clinician was provided with a copy of the profiled 
results from all tests administered. Using the test results and 
interview impressions, the clinician completed the Interview + Test 
Rating Form (Appendix 3) for that candidate. This completed rating 
form was immediately returned to the office secretary for safe- 
keeping and was not further available to the clinician. 

4. Approximately two months after the clinician had com- 
pleted his required number of cases, he was provided with the test 
profiles from every subject he had previously rated. These profiles 
were made available to the clinician singly, in random order, and 
without identifying demographic information. The clinician then 
completed the Test Rating Form (Appendix 4) for each subject 
individually. This rating form was returned to the office secretary 
who collated the three rating forms from each subject. 

Analysis Procedure 
Ratings for each of the 18 characteristic variables (Appendix 1) 
for each of the three rater conditions (test, interview, and test + 
interview) were analyzed by a one-way analysis of variance with 
repeated measures (ANOVA). This was done for each clinician 
individually and for all clinicians combined. If an F ratio 
obtained exceeded chance, individual comparisons between rater 
conditions were undertaken by the Newman-Keuls method of multiple 
comparisons. The reliability of the three ratings of each charac- 
teristic (factor) was also calculated as per a procedure 
outlined by Winer (1971, p. 290) and Ferguson (1971). 
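
The partition of variance behind this analysis can be sketched in a few lines of Python. The sketch is illustrative only: the function name `repeated_measures_anova` and the small ratings matrix are hypothetical and are not taken from the study's data. It assumes a complete design in which every subject is rated once under each of the k rater conditions, and it forms F as Mean Square Treatment over Mean Square Residual.

```python
def repeated_measures_anova(ratings):
    """One-way repeated-measures ANOVA for one characteristic.

    ratings: list of rows, one row per subject; each row holds the k
    ratings given to that subject, one per rater condition.
    """
    n = len(ratings)            # number of subjects
    k = len(ratings[0])         # number of rater conditions
    grand_mean = sum(sum(row) for row in ratings) / (n * k)
    subject_means = [sum(row) / k for row in ratings]
    condition_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Total variation, then the between-subjects and within-subjects parts.
    ss_total = sum((x - grand_mean) ** 2 for row in ratings for x in row)
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = ss_total - ss_between

    # Within-subjects variation splits into treatment (condition) and residual.
    ss_treatment = n * sum((m - grand_mean) ** 2 for m in condition_means)
    ss_residual = ss_within - ss_treatment

    df_treatment = k - 1
    df_residual = (n - 1) * (k - 1)
    f_ratio = (ss_treatment / df_treatment) / (ss_residual / df_residual)

    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_treatment": ss_treatment,
        "ss_residual": ss_residual,
        "df_treatment": df_treatment,
        "df_residual": df_residual,
        "f_ratio": f_ratio,
    }


# Hypothetical ratings: 3 subjects, each rated under 3 conditions.
example_ratings = [[1, 2, 3], [2, 3, 5], [3, 4, 4]]
result = repeated_measures_anova(example_ratings)
print(result["f_ratio"])  # approximately 9.0 for this toy matrix
```

With 20 subjects and three conditions, the same partition gives 2 and 38 degrees of freedom for treatment and residual respectively.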

Experimental Hypotheses 

There will be no significant differences between the means 
of the results obtained by any of the three assessment methods for 
any of the 18 characteristics for any of the three clinicians. 

Limitations of the Study 

This study is concerned with the convergence of clinical judge- 
ment across information sources with subject and rated character- 
istics held constant. Limitations then are limitations imposed by 
a restricted perspective. 

1. No information will be available regarding the predictive 
validity of clinical judgement. This is not a study of predictive 
validity in clinical judgement, but rather a study of a specialized 
aspect of the process of clinical judgement. 

2. Subjects were not randomly assigned to clinicians. 
Although no overt bias is present in subject assignment at 
A. W. Fraser & Associates, systematic covert bias in subject 
assignment cannot be excluded from consideration. In actual 
practice, each clinician is assigned to certain specific assignments 
based on his time availability and would see all subjects associated 
with that particular assignment. Snow (1974) would see this as the 
compromise that must occur between ecological validity on one hand 
and rigor of experimental design on the other. 

3. Subject (client) selection was not random. Subjects 
can be considered to be representative of the types of clients who 
undertake executive appraisal. 

4. All clinicians are male and all of the subjects are male. 
This may preclude generalizability of results to female populations. 

5. Clinicians are not of equal training and experience. 
Although this has been seldom realized in a study of clinical 
judgement, there is a possible, but undetermined, effect on the 
generalizability of research findings. It is possible to investi- 
gate differences between clinicians but clinician sample size is 
far too small to investigate the effects of clinicians' charac- 
teristics on judgements of subject characteristics. 

6. The possibility of clinicians' remembering profiles from 
the test + interview condition when they rated profiles in the test 
only condition is remote. It is, however, a possible weakness of 
design. The two-month delay and the volume of work processed in 
that two-month period did much to minimize this possibility. 




CHAPTER IV 
RESULTS 

In this chapter, data pertaining to each of the clinicians by 
factor by rating condition interactions are presented. Results are 
organized by factor and are presented for each of the three 
clinicians in each of the three assessment conditions. 

Definition of Terms 

Since several terms will be used extensively in summarizing 
data analysis, a description of these terms, as they apply specifi- 
cally to the present study, is given below: 

F Ratio. Since the design utilized in this study involves a 
one-way analysis of variance with repeated measures, the ratio: 
F = Mean Square Treatment/Mean Square Residual is appropriate 
(Winer, 1971, p. 267). 

Significant. Alpha is equal to .05. 

Reliability (R). The reliability coefficient (R) is a 
simple proportion which represents the proportion of obtained 
variance that is true variance. For example, if R = .80, it means 
that 80% of the variation in the measurements is due to variation 
in the true score (real differences) with the remaining 20% 
variation due to error (Ferguson, 1971). 

Unadjusted Reliability - Single Source (R1). The reliability 
of one estimate by one clinician of a single factor. 

Unadjusted Reliability - Pooled Source (Rk). The reliability 
of the mean of the pooled or combined estimates of a single factor 
by one clinician. This is frequently referred to as the Spearman- 
Brown reliability measure (Winer, 1971, p. 286). 

Adjusted Reliability - Single Source (R*1). The reliability 
of one estimate by one clinician of a single factor after removal 
of mean differences between rating conditions as a source of error 
(Winer, 1971, p. 290). 

Adjusted Reliability - Pooled Source (R*k). The reliability 
of the mean of the pooled or combined estimates of a single factor 
by one clinician after removal of mean differences between rating 
conditions as a source of error (Winer, 1971, p. 290). 

The adjusted reliability coefficients R*1 and R*k are concerned 
with pegging or anchoring of the mid-points that a judge or rater 
appears to be using in estimating performance or ability on any 
given factor or trait. For example, if judges grading ten examination 
papers maintain essentially the same rank order so far as their 
grades are concerned, but differ in the actual values they assign, 
the use of an adjusted reliability estimate just described may be 
appropriate. The reliability model which removes mean differences 
is used when both means and variances are an important interpre- 
tation consideration from the perspective of error sources. 
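
As a hedged illustration of how the four coefficients relate to the analysis of variance, the sketch below uses the standard mean-square forms found in texts such as Winer (1971). Whether the computation in this study followed exactly these forms is an assumption, although they reproduce the Clinician #1 coefficients reported for Factor 1. The function name `reliability_from_anova` is hypothetical. The unadjusted coefficients treat all within-subject variation as error; the adjusted coefficients use only the residual, so that mean differences between rating conditions are removed from the error term.

```python
def reliability_from_anova(ss_between, df_between, ss_error, df_error, k):
    """Return (single-source, pooled-source) reliability estimates.

    ss_error / df_error is the within-subjects term for the unadjusted
    coefficients and the residual term for the adjusted coefficients.
    k is the number of ratings pooled per subject.
    """
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    single = (ms_between - ms_error) / (ms_between + (k - 1) * ms_error)
    pooled = (ms_between - ms_error) / ms_between
    return single, pooled


# Clinician #1, Factor 1 (Table 1): 20 subjects, k = 3 rating conditions.
# The within-subjects df of 40 = 20 * (3 - 1) is inferred from the design.
r1, rk = reliability_from_anova(20.18, 19, 12.00, 40, k=3)
r1_adj, rk_adj = reliability_from_anova(20.18, 19, 10.77, 38, k=3)
print(round(r1, 2), round(rk, 2), round(r1_adj, 2), round(rk_adj, 2))
# prints 0.46 0.72 0.48 0.73
```

When raters keep the same rank order but differ in level, the treatment sum of squares is large and the residual small, so the adjusted coefficients exceed the unadjusted ones.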

In discussion of reliability in this chapter, the adjusted 
reliability (R*1 and R*k) will be used predominantly, although 
both adjusted and unadjusted reliability estimates are presented 
in table form for reference. For purposes of the discussion of 
convergence, each of the reliability estimates just described 
(R1, Rk, R*1, and R*k) can be considered as important and will be 
presented within the context of interpretation for each factor 
individually. 

The relationship between R1 and Rk or R*1 and R*k may be 
expressed as: Rk = 3R1 / (1 + 2R1) or R*k = 3R*1 / (1 + 2R*1). This 
means that as R1 or R*1 approaches one as an absolute value, R1 
approaches Rk and R*1 approaches R*k. 
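
The relationship between single-source and pooled-source reliability is the Spearman-Brown formula with k = 3 ratings pooled, Rk = 3R1 / (1 + 2R1). A minimal sketch follows; the function name `pooled_reliability` is hypothetical, and the inputs are the rounded single-source coefficients reported for Factor 1, so the outputs agree with the tabled pooled values only to within rounding.

```python
def pooled_reliability(r_single, k=3):
    """Spearman-Brown: reliability of the mean of k ratings,
    given the reliability of a single rating."""
    return k * r_single / (1 + (k - 1) * r_single)


# Rounded single-source values (R1) for Clinicians #1, #2, and #3.
for r1 in (0.46, 0.11, 0.43):
    print(r1, round(pooled_reliability(r1), 2))
```

As the single-source value approaches one, the pooled value approaches the same limit: `pooled_reliability(1.0)` is exactly 1.0.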




Factor 1: Intelligence 

Factor 1 has been defined as "the basic ability to learn and 
understand" (Appendix 1). In this study, aspects of this factor 
are sampled by clinical interpretation of psychometric tests such 
as the Wonderlic Personnel Test, Watson-Glaser Critical Thinking 
Appraisal, and the Differential Aptitude Tests (Abstract and Verbal) 
as well as by interview expertise.¹ 

Table 1 presents the one-way analysis of variance with repeated 
measures (ANOVA) performed between the results obtained from each 
of the three assessment conditions for each of the clinicians 
individually. As is evident from Table 1, there is a significant 
degree of parallelism between the results obtained in each assess- 
ment category. This is true for all three clinicians. None of the 
F ratios obtained are sufficiently large to warrant further 
between-groups comparisons. 

Tables 2 and 3 summarize the reliabilities, means, and standard 
deviations associated with Factor 1. As would be expected on the 
basis of the previously mentioned F test, there is a marked similarity 
in both the means and standard deviations of the scores in each of 
the three assessment conditions for all three clinicians. 

Clinicians differ markedly in the reliability of their decisions
made with respect to levels of intelligence. Clinicians #1 and #3
obtain a single measure reliability of approximately .50 with a
pooled source Rk greater than .70. The single source R for
Clinician #2 is so low as to cause concern for purposes of
prediction. Even when one pools estimates (R*k), a value of only
.32 is obtained, lower even than the R*1 for either of the other
two clinicians. If this R value is in fact typical for all
occasions, one should anticipate a low predictive validity of
intelligence ratings made across information sources for
Clinician #2. One might expect predictably unpredictable
predictions!

On each of the 18 factors, the tests which are indicated as being
clinically combined for purposes of measuring these factors are as
indicated by the three clinicians.
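A minimal sketch of how single-source and pooled reliabilities of this
kind could be derived from an analysis of variance table. It assumes an
intraclass formulation in which the unadjusted estimates (R1, Rk) take
the within-subjects mean square as error and the adjusted estimates
(R*1, R*k) take the residual mean square; that formulation and the
function name are assumptions for illustration, not something stated in
this section. With the Clinician #1 sums of squares for Factor 1
(Between 20.18, Within 12.00, Residual 10.77) it gives .46, .72, .48
and .73, which agree with the Clinician #1 entries for Factor 1.

```python
def reliability_estimates(ss_between, df_between, ss_within, df_within,
                          ss_residual, df_residual, k):
    """Return R1, Rk, R*1 and R*k for k information sources.

    Assumed formulation (hypothetical helper, not taken from the thesis):
      single = (MSb - MSe) / (MSb + (k - 1) * MSe)
      pooled = (MSb - MSe) / MSb
    with MSe = within-subjects mean square (unadjusted)
    or residual mean square (adjusted).
    """
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    ms_residual = ss_residual / df_residual

    def single(ms_error):
        return (ms_between - ms_error) / (ms_between + (k - 1) * ms_error)

    def pooled(ms_error):
        return (ms_between - ms_error) / ms_between

    return {
        "R1": single(ms_within),
        "Rk": pooled(ms_within),
        "R*1": single(ms_residual),
        "R*k": pooled(ms_residual),
    }


# Clinician #1, Factor 1: three rating conditions, 20 subjects.
estimates = reliability_estimates(20.18, 19, 12.00, 40, 10.77, 38, k=3)
for name, value in estimates.items():
    print(name, round(value, 2))
```

Whether every reliability table in this chapter was produced this way
is not confirmed by the text; the sketch only shows that the formulation
is consistent with the Factor 1 figures for Clinician #1.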




TABLE 1

One Way Analysis of Variance with Repeated Measures

Factor 1: Intelligence

Rater  Source of Variation  Sums of Squares    df  F            p            p*

1      Between                        20.18    19
       Within                         12.00    40
         Treatment                     1.23     2  [illegible]  [illegible]  [illegible]
         Residual                     10.77    38
       Total                          32.18    59

2      Between                         9.99    23
       Within                         15.34    48
         Treatment                     1.78     2  3.02         .06          .09
         Residual                     13.56    46
       Total                          25.33    71

3      Between                        29.43    29
       Within                         18.67    60
         Treatment                     1.27     2  [illegible]  [illegible]  [illegible]
         Residual                     17.40    58
       Total                          48.10    89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 2

Unadjusted and Adjusted Reliability Estimates

Factor 1: Intelligence

Clinician  Source  Unadjusted   Adjusted
                   Reliability  Reliability

1          Single  .46 (R1)     .48 (R*1)
1          Pooled  .72 (Rk)     .73 (R*k)
2          Single  .11 (R1)     .14 (R*1)
2          Pooled  .26 (Rk)     .32 (R*k)
3          Single  .43 (R1)     .44 (R*1)
3          Pooled  .69 (Rk)     .70 (R*k)

TABLE 3

Means and Standard Deviations

Factor 1: Intelligence

Clinician  Rating Condition  Mean         Standard Deviation

1          Interview         4.40         .58
1          Test              4.20         .81
1          Combined          4.55         .71
2          Interview         4.37         .54
2          Test              4.37         [illegible]
2          Combined          4.71         .45
3          Interview         4.50         .50
3          Test              4.53         .76
3          Combined          4.27         [illegible]




Factor 2: Common Sense 

Factor 2 is described as "the degree of ability to reach 
quick, practically effective decisions about uncomplicated situa- 
tions where sound judgement depends primarily on accumulated life 
and work experience, established precedent and procedures, etc." 
(Appendix 1). In this study, "common sense" is sampled by the 
clinical interpretation of tests such as Management Aptitude 
Inventory, California Psychological Inventory, and The Test of 
Practical Judgement in addition to interview evaluation. 

Table 4 summarizes the ANOVA pertaining to Factor 2 for each 
of the three clinicians. For Clinicians #1 and #3, the differences 
in the diagnoses made between information sources are not signifi- 
cant. For Clinician #2 the differences in the diagnosis made 
between information sources are significant (F = 4.48, p = .02)
and individual comparisons between groups are warranted. A 
Newman-Keuls multiple comparison between the three means (Winer, 
1971, p. 217) indicates that the mean of the interview group is 
significantly greater than the mean of the test group and that the 
mean of the combined group is also significantly greater than the 
mean of the test group. There is no significant difference 
between the means of interview and combined groups for Clinician #2. 
It appears that subjects rated by Clinician #2 were rated signifi- 
cantly lower in the test condition than in either of the other 
two assessment conditions. 


From Table 6, we see that these mean differences between groups
for Clinician #2, although significant, are not great, being in the
order of .5. It is noteworthy that the standard deviation of the
test condition for Clinician #2 is greater than that observed in
either of the other two assessment conditions. The standard
deviation of the test condition most closely parallels that of the
combined assessment condition where one might expect test results
to exert a moderating influence on the interview impressions.

Reliability values associated with Factor 2 for the three
clinicians are moderate with R*1's in the order of .40 and R*k's in
the order of .68. By more than tripling the amount of time required
for each evaluation, variance error is reduced by approximately
30%. A subject is appraised slightly differently in "common sense"
depending on the assessment condition in which he is viewed.
Particularly with Clinician #2, a candidate might be downrated
somewhat if seen only in the test assessment condition.
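The relation between a single-source reliability and a pooled one, and
the drop in error variance that pooling buys, can be sketched with the
Spearman-Brown formula. Its use here is an assumption made for
illustration (the section does not name the formula), and the function
name is hypothetical. Starting from a single-source value of .40 and
three sources, the pooled value comes out near two thirds, so the error
proportion falls from .60 to about one third.

```python
def spearman_brown(r_single, k):
    """Reliability of k pooled sources, given the single-source reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


r_single = 0.40                      # rough R*1 level quoted for Factor 2
r_pooled = spearman_brown(r_single, 3)

error_single = 1 - r_single          # proportion of variance that is error
error_pooled = 1 - r_pooled
print(round(r_pooled, 2), round(error_single - error_pooled, 2))
```

The difference between the two error proportions is in the region of
the "approximately 30%" reduction described above.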




TABLE 4

One Way Analysis of Variance with Repeated Measures

Factor 2: Common Sense

Rater  Source of Variation  Sums of Squares    df  F            p            p*

1      Between                        12.85    19
       Within                          9.33    40
         Treatment                      .43     2  .93          .41          [illegible]
         Residual                      8.90    38
       Total                          22.18    59

2      Between                        25.12    23
       Within                         22.66    48
         Treatment                     3.69     2  4.48         .02          .05
         Residual                     18.97    46
       Total                          47.78    71

3      Between                        47.39    29
       Within                         28.00    60
         Treatment                     2.49     2  2.83         .07          .10
         Residual                     25.51    58
       Total                          75.39    89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 5

Unadjusted and Adjusted Reliability Estimates

Factor 2: Common Sense

Clinician  Source  Unadjusted   Adjusted
                   Reliability  Reliability

1          Single  .33 (R1)     .39 (R*1)
1          Pooled  .60 (Rk)     .65 (R*k)
2          Single  .30 (R1)     .35 (R*1)
2          Pooled  .57 (Rk)     .62 (R*k)
3          Single  .45 (R1)     .48 (R*1)
3          Pooled  .71 (Rk)     .73 (R*k)

TABLE 6

Means and Standard Deviations

Factor 2: Common Sense

Clinician  Rating Condition  Mean         Standard Deviation

1          Interview         3.85         .65
1          Test              3.80         .60
1          Combined          4.00         .55
2          Interview         3.62         .56
2          Test              3.12         .88
2          Combined          3.58         .86
3          Interview         3.83         .73
3          Test              3.57         .99
3          Combined          3.43         .35




Factor 3: Oral Communication 

Factor 3 is described by the clinicians involved in the study 
as "the degree of clarity and ease with which an individual 
expresses himself in face-to-face discussion" (Appendix 1). In 
this study, aspects of interpersonal effectiveness are sampled by 
interpretation of the California Psychological Inventory: Section I, 
and by interview evaluation. 

As evidenced by Table 7, the F ratios obtained for each of
the three clinicians were not significant. Variances within groups
and between groups were essentially the same. From Table 9, we
see that this similarity is further evidenced by the closeness
of means and variances within each clinician cluster.

Reliability coefficients R*1 and R*k are not high, particularly
for Clinicians #2 and #3. Although mean differences between rating
conditions appear to cancel each other out as evidenced by the low
F ratios obtained, the effect of differential rankings on the R*1
and R*k values is considerable. Particularly for Clinicians #2 and #3,
the reliability of any single estimate of oral communication ability
(R*1) is so low that a great deal more of the prediction is
accounted for by error than by true variation.

It is noteworthy that, although Clinician #3 indicated that
he could not rate oral communication in the test condition,¹ the
other two clinicians were able to do so with results comparable to
their ratings in the other two assessment conditions. However,
it does not appear that test information regarding oral communication
exerts much of a moderating influence vis-a-vis the distinctions
between interview and combined scores for any clinician; they are
highly parallel.

¹Clinician #3 did not rate Factor #3 in the test condition. He
indicated that this was not normal procedure for him.




TABLE 7

One Way Analysis of Variance with Repeated Measures

Factor 3: Oral Communication

Rater  Source of Variation  Sums of Squares    df  F            p            p*

1      Between                        16.73    19
       Within                         12.00    40
         Treatment                      .03     2  .05          .95          [illegible]
         Residual                     11.97    38
       Total                          28.73    59

2      Between                  [illegible]    23
       Within                         29.33    48
         Treatment                      .58     2  .47          .63          .50
         Residual                     28.75    46
       Total                    [illegible]    71

3      Between                        20.60    29
       Within                         12.00    30
         Treatment                      .07     1  .16          .69          .69
         Residual                     11.93    29

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 8

Unadjusted and Adjusted Reliability Estimates

Factor 3: Oral Communication

Clinician  Source  Unadjusted   Adjusted
                   Reliability  Reliability

1          Single  .39 (R1)     .39 (R*1)
1          Pooled  .66 (Rk)     .64 (R*k)
2          Single  .16 (R1)     .15 (R*1)
2          Pooled  .37 (Rk)     .35 (R*k)
3          Single  .28 (R1)     .27 (R*1)
3          Pooled  .44 (Rk)     .42 (R*k)

TABLE 9

Means and Standard Deviations

Factor 3: Oral Communication

Clinician  Rating Condition  Mean         Standard Deviation

1          Interview         3.55         [illegible]
1          Test              3.60         .66
1          Combined          3.55         .70
2          Interview         3.46         .86
2          Test              3.62         .90
2          Combined          3.67         [illegible]
3          Interview         4.33         .91
3          Test              --           --
3          Combined          [illegible]  [illegible]




Factor 4: Self-Starting Work Drive 


Factor 4 is defined as "the degree to which an individual 
characteristically keeps himself continuously occupied in work 
related activities without need of stimulation from his supervisor" 
(Appendix 1). In this study, aspects of this factor are sampled 
by an interpretation of the Management Aptitude Inventory, 
Vocational Preference Inventory and California Psychological 
Inventory subscales, as well as by interview evaluations.

Table 10 summarizes the ANOVA pertaining to Factor 4 for 
each of the three clinicians. As is evident, significant F ratios
were obtained for Clinicians #1 and #3. In both cases, a Newman-Keuls
multiple comparison between the respective mean differences
indicates that the mean of the interview group is significantly
higher than the mean of the test group. For Clinicians #1 and #3, 
it seems that candidates impress as having more self-starting 
work drive when assessed by interview than when assessed by tests. 
There is also more variance in rating this factor in the test 
condition indicating that interview ratings are much more tightly 
clustered around the mean values (little inter-individual variation). 
R values are acceptably high with R*1 accounting for approximately
50% of the overall variance in all cases. R*k, which combines
estimates from all rating conditions, improves on R*1 by
approximately 20%. In practice, Factor 4 could probably be rated by
any single method with acceptable results.
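The Newman-Keuls comparison referred to above rests on a studentized
range statistic for each pair of ordered means. The sketch below
computes only those statistics and the number of ordered steps each
pair spans. It uses the Clinician #1 figures for Factor 4 (means 3.95,
3.25 and 3.45; residual sum of squares 18.13 on 38 degrees of freedom;
20 subjects). The function name is hypothetical, and the critical
values are deliberately left out: they would have to be read from a
studentized range table such as the one in Winer (1971).

```python
from itertools import combinations
from math import sqrt


def newman_keuls_q(means, ms_residual, n_subjects):
    """Return (higher, lower, q, steps) for every pair of condition means.

    q     = difference between the two means / sqrt(MS residual / n)
    steps = number of ordered means spanned by the pair (2 = adjacent)
    """
    standard_error = sqrt(ms_residual / n_subjects)
    ordered = sorted(means.items(), key=lambda item: item[1])
    position = {name: index for index, (name, _) in enumerate(ordered)}
    results = []
    for (low_name, low_mean), (high_name, high_mean) in combinations(ordered, 2):
        q = (high_mean - low_mean) / standard_error
        steps = position[high_name] - position[low_name] + 1
        results.append((high_name, low_name, q, steps))
    return results


means = {"Interview": 3.95, "Test": 3.25, "Combined": 3.45}
comparisons = newman_keuls_q(means, ms_residual=18.13 / 38, n_subjects=20)
for higher, lower, q, steps in comparisons:
    print(f"{higher} vs {lower}: q = {q:.2f} across {steps} ordered means")
```

Each q would then be compared with the tabled value for its own number
of steps and the residual degrees of freedom, beginning with the widest
range and working inward.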




TABLE 10

One Way Analysis of Variance with Repeated Measures

Factor 4: Self-Starting Work Drive

Rater  Source of Variation  Sums of Squares    df  F            p            p*

1      Between                        37.52    19
       Within                         23.33    40
         Treatment                     5.20     2  5.45         .008         .03
         Residual                     18.13    38
       Total                          60.85    59

2      Between                        43.99    23
       Within                         21.33    48
         Treatment                     2.19     2  2.64         .08          [illegible]
         Residual                     19.14    46
       Total                          65.32    71

3      Between                        60.49    29
       Within                         46.00    60
         Treatment                     9.62     2  7.67         .001         .009
         Residual                     36.38    58
       Total                         106.49    89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.
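As an illustration of how the F and p columns relate to the sums of
squares, the sketch below recomputes the Clinician #1 treatment test
for Factor 4 (Treatment 5.20 on 2 degrees of freedom, Residual 18.13 on
38). The closed-form tail probability holds only when the numerator has
exactly 2 degrees of freedom, as it does with three rating conditions.
The function names are hypothetical. Treating p* as a test on 1 and
n - 1 degrees of freedom (the Geisser-Greenhouse conservative test) is
an assumption; the text describes p* only as a conservative
probability.

```python
def repeated_measures_f(ss_treatment, df_treatment, ss_residual, df_residual):
    """F ratio for the treatment (rating condition) effect."""
    return (ss_treatment / df_treatment) / (ss_residual / df_residual)


def p_value_two_numerator_df(f_ratio, df_residual):
    """Upper-tail probability of F on (2, df_residual) degrees of freedom.

    Closed form valid only for 2 numerator degrees of freedom.
    """
    return (1 + 2 * f_ratio / df_residual) ** (-df_residual / 2)


def conservative_df(n_subjects):
    """Degrees of freedom assumed here for the conservative test: (1, n - 1)."""
    return 1, n_subjects - 1


f_ratio = repeated_measures_f(5.20, 2, 18.13, 38)
p_value = p_value_two_numerator_df(f_ratio, 38)
print(round(f_ratio, 2), round(p_value, 3), conservative_df(20))
```

With these inputs the sketch gives F = 5.45 and p = .008, the values
reported for Clinician #1.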




TABLE 11

Unadjusted and Adjusted Reliability Estimates

Factor 4: Self-Starting Work Drive

Clinician  Source  Unadjusted   Adjusted
                   Reliability  Reliability

1          Single  .44 (R1)     .51 (R*1)
1          Pooled  .70 (Rk)     .76 (R*k)
2          Single  .52 (R1)     .55 (R*1)
2          Pooled  .77 (Rk)     .78 (R*k)
3          Single  .36 (R1)     .44 (R*1)
3          Pooled  .63 (Rk)     .70 (R*k)

TABLE 12

Means and Standard Deviations

Factor 4: Self-Starting Work Drive

Clinician  Rating Condition  Mean         Standard Deviation

1          Interview         3.95         .67
1          Test              3.25         1.18
1          Combined          3.45         .97
2          Interview         2.92         .64
2          Test              3.21         1.08
2          Combined          3.33         1.03
3          Interview         3.90         [illegible]
3          Test              3.10         1.04
3          Combined          3.47         [illegible]




Factor 5: Interpersonal Effectiveness 

Factor 5 is defined as "the level of effectiveness the
individual demonstrates in day-to-day dealings with others with 
regard to gaining and maintaining their respect for his ideas and 
opinions, their confidence in his integrity, and their general 
feeling of good will" (Appendix 1). Aspects of this factor are 
appraised by the California Psychological Inventory, Vocational 
Preference Inventory, Edwards Personal Preference Schedule, 
Management Aptitude Inventory as well as by interview evaluations. 

From Table 13, we see that significant F ratios were 
obtained only for Clinician #2. A Newman-Keuls comparison between 
mean differences indicates that the mean of the interview rating 
condition is significantly higher than both of the other two means. 
Subjects are rated significantly higher in interpersonal effective- 
ness during interview than when they are rated in either the test 
or combined condition. It seems likely that test information 
exerts a moderating influence on the interview evaluations when 
the combined rating is made. Combined ratings more closely parallel 
those of the test condition with respect to the pegging of mean 
values. 

Although the results for Clinician #3 do not indicate a
significant F, R values are very low. This indicates that, although
deviations made over the total group within conditions appear to
cancel one another out, ratings of individuals between conditions
vary greatly. Even the R*k value of .36 is only at a level equal
to the R*1 value for the other two clinicians. More than three
times the effort for Clinician #3 is required to match the
reliability estimate for a single occasion for each of the other two
clinicians. One should anticipate inconsistent predictions on
interpersonal effectiveness for Clinician #3.
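The "more than three times the effort" remark can be illustrated by
inverting the Spearman-Brown relation to ask how many pooled sources a
rater with a given single-source reliability would need in order to
reach a target value. Using that relation is an assumption made for
illustration, and the function name is hypothetical. The inputs are the
adjusted single-source values for Factor 5: .16 for Clinician #3,
against .31 and .39 for Clinicians #1 and #2.

```python
def sources_needed(r_single, r_target):
    """Number of pooled sources needed to lift r_single to r_target
    under the Spearman-Brown relation (may be fractional)."""
    return r_target * (1 - r_single) / (r_single * (1 - r_target))


# Clinician #3 (R*1 = .16) matched against the other two clinicians.
for target in (0.31, 0.39):
    print(target, round(sources_needed(0.16, target), 1))
```

Against the higher of the two targets the required multiple is a little
over three, in line with the statement above; against the lower target
it is smaller.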


TABLE 13

One Way Analysis of Variance with Repeated Measures

Factor 5: Interpersonal Effectiveness

Rater  Source of Variation  Sums of Squares    df  F            p            p*

1      Between                        16.67    19
       Within                         14.67    40
         Treatment                      .63     2  .86          .43          .37
         Residual                     14.04    38
       Total                          31.33    59

2      Between                        22.55    23
       Within                         19.33    48
         Treatment                     4.08     2  6.16         .004         .02
         Residual                     15.25    46
       Total                          41.88    71

3      Between                  [illegible]    29
       Within                   [illegible]    60
         Treatment              [illegible]     2  [illegible]  [illegible]  [illegible]
         Residual                     27.18    58
       Total                          50.49    89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 14

Unadjusted and Adjusted Reliability Estimates

Factor 5: Interpersonal Effectiveness

Clinician  Source  Unadjusted   Adjusted
                   Reliability  Reliability

1          Single  .32 (R1)     .31 (R*1)
1          Pooled  .58 (Rk)     .58 (R*k)
2          Single  .32 (R1)     .39 (R*1)
2          Pooled  .59 (Rk)     .66 (R*k)
3          Single  .14 (R1)     .16 (R*1)
3          Pooled  .33 (Rk)     .36 (R*k)

TABLE 15

Means and Standard Deviations

Factor 5: Interpersonal Effectiveness

Clinician  Rating Condition  Mean         Standard Deviation

1          Interview         3.35         .79
1          Test              3.45         .80
1          Combined          3.20         [illegible]
2          Interview         3.54         .76
2          Test              3.00         .76
2          Combined          3.08         .64
3          Interview         [illegible]  [illegible]
3          Test              [illegible]  .74
3          Combined          [illegible]  .67




Factor 6: Leadership Force 

Factor 6 is described as "the amount of influence and dominance 
the individual habitually exerts over groups and persons he 
encounters" (Appendix 1). Aspects of this factor are appraised by 
the California Psychological Inventory, Management Aptitude 
Inventory and by interview evaluations. 

It is encouraging to view the results from the appraisal of 
leadership force under each of the three different rating condi- 
tions. Not only are the F ratios small, but reliability measures, 
in both the individual and pooled cases, are encouragingly high. 
Leadership force appears to be rated symmetrically both between
and within rating conditions. Further, there do not appear to be
any inter-rater differences with respect to the ratings of
leadership force. Means, standard deviations (Table 18), and
reliabilities (Table 17) are highly convergent for all three
clinicians.




TABLE 16

One Way Analysis of Variance with Repeated Measures

Factor 6: Leadership Force

Rater  Source of Variation  Sums of Squares    df  F            p            p*

1      Between                        34.40    19
       Within                         15.33    40
         Treatment                      .63     2  .82          [illegible]  .38
         Residual                     14.70    38
       Total                          49.73    59

2      Between                        48.44    23
       Within                         23.33    48
         Treatment                     1.19     2  1.24         .30          .28
         Residual                     22.14    46

3      Between                        71.96    29
       Within                         36.67    60
         Treatment                     1.16     2  .94          .40          .34
         Residual                     35.51    58
       Total                         108.62    89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 17

Unadjusted and Adjusted Reliability Estimates

Factor 6: Leadership Force

Clinician  Source  Unadjusted   Adjusted
                   Reliability  Reliability

1          Single  .55 (R1)     .55 (R*1)
1          Pooled  .79 (Rk)     .79 (R*k)
2          Single  .53 (R1)     .53 (R*1)
2          Pooled  .77 (Rk)     .77 (R*k)
3          Single  .50 (R1)     .50 (R*1)
3          Pooled  .75 (Rk)     .75 (R*k)

TABLE 18

Means and Standard Deviations

Factor 6: Leadership Force

Clinician  Rating Condition  Mean         Standard Deviation

1          Interview         3.15         .79
1          Test              3.40         1.02
1          Combined          3.25         .89
2          Interview         2.83         .80
2          Test              2.87         1.01
2          Combined          [illegible]  [illegible]
3          Interview         3.20         1.01
3          Test              3.47         [illegible]
3          Combined          3.40         1.14




Factor 7: Self-Reliance 
Self-reliance is "the degree to which the individual carries
out assigned responsibilities without seeking direction, help,
encouragement and/or reassurance from co-workers" (Appendix 1).
In this study, elements of this factor are assessed by interpretation
of the Edwards Personal Preference Schedule, California
Psychological Inventory, Management Aptitude Inventory, and by
interview evaluation.

Table 19 summarizes the ANOVA done with respect to Factor 7. 
As noted, significant differences between means were observed only 
for Clinician #2. A Newman-Keuls multiple comparison of mean 
differences reveals that the mean of the scores obtained from the 
interview condition is greater than the mean of the scores obtained 
in the test condition. Subjects were typically rated higher in 
self-reliance in the interview condition. Once again, for 
Clinician #2, test results appear to moderate interview impressions 
since the mean of the test condition is not significantly different 
from the mean of the interview condition. 

Reliability measures for clinicians vary considerably for
Factor 7. Both Clinicians #2 and #3 obtain R*k values which are
less than the R*1 value obtained by Clinician #1. With R*1 equal
to approximately .25 for Clinicians #2 and #3, one might expect a
considerable difference in prediction dependent on rating condition.




TABLE 19

One Way Analysis of Variance with Repeated Measures

Factor 7: Self-Reliance

Rater  Source of Variation  Sums of Squares    df  F            p            p*

1      Between                  [illegible]    19
       Within                   [illegible]    40
         Treatment              [illegible]     2  .74          .48          .40
         Residual               [illegible]    38
       Total                          54.40    59

2      Between                        22.61    23
       Within                         28.00    48
         Treatment                     4.19     2  4.05         .02          .05
         Residual                     23.81    46
       Total                          50.61    71

3      Between                        45.15    29
       Within                         50.00    60
         Treatment                     4.69     2  3.00         .06          .09
         Residual                     45.31    58
       Total                          95.15    89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 20

Unadjusted and Adjusted Reliability Estimates

Factor 7: Self-Reliance

Clinician  Source  Unadjusted   Adjusted
                   Reliability  Reliability

1          Single  .50 (R1)     .50 (R*1)
1          Pooled  .72 (Rk)     .75 (R*k)
2          Single  .19 (R1)     .23 (R*1)
2          Pooled  .41 (Rk)     .47 (R*k)
3          Single  .22 (R1)     .25 (R*1)
3          Pooled  .46 (Rk)     .50 (R*k)

TABLE 21

Means and Standard Deviations

Factor 7: Self-Reliance

Clinician  Rating Condition  Mean         Standard Deviation

1          Interview         3.70         .46
1          Test              3.65         [illegible]
1          Combined          3.45         [illegible]
2          Interview         3.46         .81
2          Test              2.87         .66
2          Combined          3.08         .91
3          Interview         3.50         .81
3          Test              3.00         1.18
3          Combined          3.03         .98




Factor 8: Adaptability 

Adaptability is defined as "the level of ability to cope 
comfortably with new and changing circumstances" (Appendix 1). In 
this study, aspects of this factor are appraised by tests such as 
the Edwards Personal Preference Schedule, the California Psycho- 
logical Inventory, Vocational Preference Inventory, as well as by 
interview evaluations. 

Table 22 summarizes the ANOVA relevant to Factor 8. As 
noted, no significant differences are evident, save for Clinician #3. 
A Newman-Keuls multiple comparison between mean differences for 
Clinician #3 indicates that the mean of the interview condition is 
significantly higher than the mean of the scores in either of the
remaining two categories. In the same manner as was evident for 
Clinician #2 on Factors 7 and 5 and for Clinician #3 on Factor 4, 
the test protocols appear to exert a moderating influence on inter- 
view evaluations when a combined rating is undertaken. 

The significant mean difference evidenced by Clinician #3
is combined with a low reliability (R*1 = .23) indicating the very
real possibility of differential diagnosis depending on the rating
condition. For Clinician #1, although mean differences do not
appear to be a large error source, considerable differences in
ranking are apparent as reflected in the low value of R*1 =
[illegible], which is independent of the similarity or difference
of mean pegging between groups. Clinician #2 obtained an R*1 value
which is considerably higher than even the R*k value for the other
two clinicians. His single estimate of adaptability is encouragingly
high and little is gained by combining all three methods.


Dal 


TABLE 22

One Way Analysis of Variance with Repeated Measures

Factor 8: Adaptability

Rater   Source of Variation   Sums of Squares   df   F             p             p*

1       Between               [illegible]       19
        Within                [illegible]       40
          Treatment           [illegible]       2    [illegible]   [illegible]   [illegible]
          Residual            15.43             38
        Total                 34.18             59

2       Between               [illegible]       23
        Within                18.00             48
          Treatment           1.86              2    [illegible]   [illegible]   [illegible]
          Residual            16.14             46
        Total                 [illegible]       71

3       Between               41.29             29
        Within                [illegible]       60
          Treatment           13.79             2    [illegible]   .0003         .005
          Residual            43.58             58
        Total                 98.62             89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.


TABLE 23

Unadjusted and Adjusted Reliability Estimates

Factor 8: Adaptability

Clinician   Source   Unadjusted    Adjusted
                     Reliability   Reliability

1           Single   .29 (R1)      .30 (R*1)
1           Pooled   .55 (Rk)      .56 (R*k)
2           Single   .60 (R1)      .62 (R*1)
2           Pooled   .82 (Rk)      .83 (R*k)
3           Single   .14 (R1)      .23 (R*1)
3           Pooled   .33 (Rk)      .47 (R*k)
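
The pooled estimates in Table 23 stand in a fixed relation to the single estimates. The sketch below assumes that relation is the Spearman-Brown step-up for k = 3 rating conditions. The text does not name the formula, so this is an inference from the tabled values, and the function name is illustrative.

```python
def spearman_brown(r_single, k):
    """Reliability of the average of k ratings, given single-rating reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


# Single-condition estimates for Clinician #2 as read from Table 23.
for label, r_single in [("R1", 0.60), ("R*1", 0.62)]:
    print(label, "->", round(spearman_brown(r_single, 3), 2))
```

Under this assumption the stepped-up values for Clinician #2 come out at .82 and .83, which agree with the pooled entries (Rk and R*k) in the table.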


TABLE 24

Means and Standard Deviations

Factor 8: Adaptability

Clinician   Rating Condition   Mean          Standard Deviation

1           Interview          3.30          .64
1           Test               3.10          .70
1           Combined           2.95          .86
2           Interview          3.33          .85
2           Test               3.04          1.06
2           Combined           2.96          .89
3           Interview          3.90          1.07
3           Test               3.00          .86
3           Combined           [illegible]   [illegible]




Factor 9: Potential for Growth 

Potential for growth is defined as "the degree of probability 
that an individual will develop the personal resources to cope with 
increasingly more complex and responsible work roles" (Appendix 1). 
In this study, potential for growth is appraised by a clinical 
integration of all information obtained by testing plus interview 
evaluations. 

On evaluating the observations in Table 25 which summarizes 
the ANOVA for Factor 9, we see that a significant difference 
exists between the means of the three assessment conditions for 
Clinician #3. A Newman-Keuls multiple comparison of mean differences 
indicates that, as was the case for Clinician #3 on Factor 4, the 
mean of the interview assessment condition is significantly higher 
than the mean of the test assessment group. Once again, we see 
the moderating effect of test information on interview evaluations 
when rating in the combined condition. In the cases of Clinicians 
#1 and #2, a high degree of similarity is evident across rating 
conditions; no significant differences are evident. 

Coupled with the significant differences in mean rating 
demonstrated by Clinician #3, we see a low R*1 associated with the 
estimation of Factor 9. Once again, the use of all three methods 
in obtaining an R*k = .65 for Clinician #3 only approximates the 
single source estimates obtained by Clinicians #1 and #2. 
Reliability estimates for Clinicians #1 and #2 are much more
independent of assessment condition.




TABLE 25

One Way Analysis of Variance with Repeated Measures

Factor 9: Potential for Growth

Rater   Source of Variation   Sums of Squares   df   F             p             p*

1       Between               [illegible]       19
        Within                12.00             40
          Treatment           .63               2    1.06          .36           [illegible]
          Residual            11.37             38
        Total                 [illegible]       59

2       Between               48.65             23
        Within                20.00             48
          Treatment           1.03              2    1.25          [illegible]   .28
          Residual            18.97             46
        Total                 68.65             71

3       Between               38.99             29
        Within                31.33             60
          Treatment           3.90              2    4.12          .02           .05
          Residual            27.44             58
        Total                 70.32             89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 26

Unadjusted and Adjusted Reliability Estimates

Factor 9: Potential for Growth

Clinician   Source   Unadjusted    Adjusted
                     Reliability   Reliability

1           Single   .61 (R1)      .61 (R*1)
1           Pooled   .83 (Rk)      .83 (R*k)
2           Single   .58 (R1)      .58 (R*1)
2           Pooled   .80 (Rk)      .80 (R*k)
3           Single   .34 (R1)      .38 (R*1)
3           Pooled   .61 (Rk)      .65 (R*k)

TABLE 27

Means and Standard Deviations

Factor 9: Potential for Growth

Clinician   Rating Condition   Mean          Standard Deviation

1           Interview          2.90          .62
1           Test               2.75          [illegible]
1           Combined           2.60          .91
2           Interview          3.08          .86
2           Test               2.92          .95
2           Combined           3.21          1.08
3           Interview          3.97          [illegible]
3           Test               [illegible]   .89
3           Combined           3.40          .92




Factor 10: Readiness to Learn 

Readiness to learn is defined as "the individual's willingness 
to acquire new information, explore new ideas, methods, tasks, etc." 
(Appendix 1). In this study, it is appraised by tests such as the 
California Psychological Inventory, Vocational Preference Inventory, 
Wonderlic, and the Differential Aptitude Tests as well as by 
interview evaluations. 

From an examination of Table 28, it appears that all clinicians
experienced more difficulty in the rating of Factor 10 than they
did with many of the other factors. Significant differences
between rating conditions were evident for all three clinicians.
A Newman-Keuls multiple comparison between means for each of the
clinicians reveals considerable similarity in the differences
exhibited. For Clinicians #1 and #2, the mean of the interview
condition is significantly higher than the mean of the test
condition. For Clinician #3, the mean of the interview condition is
significantly greater than both the mean of the test condition and
the mean of the combined condition. Table 30 indicates that for
all three clinicians, test results appear to be moderating
interview impressions in the combined rating condition. For Clinician #3,
this moderating effect is not great, resulting in the additional
significant difference between interview and combined mean ratings.

Although significant F ratios were obtained for all clinicians,
reliability estimates are not so uniform. Clinicians #1 and #2
parallel each other, obtaining an R*1 value of approximately .47.
Clinician #3, as has been the case on Factors 8, 7, 5, and 3,
obtains an R*1 value approximately one-half that of his counterparts.
His degree of convergence between ratings is low.
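
The link between the ANOVA table and the reliability estimates can be made concrete. The sketch below assumes that R1 and Rk are the usual ANOVA intraclass correlations with the within-candidate mean square as error, and that R*1 and R*k use the residual mean square instead (that is, with mean differences between rating conditions removed). The text does not spell out these formulas; the assumption is offered because it reproduces the Clinician #1 entries for Factor 10 from the sums of squares as read from Table 28 (Between 24.18, Within 16.67, Residual 13.87).

```python
def intraclass_reliabilities(ss_between, df_between, ss_error, df_error, k):
    """Return (single, pooled) reliability from ANOVA sums of squares.

    ss_error is the within-candidate SS for the unadjusted estimates and
    the residual SS for the adjusted (starred) estimates.
    """
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    single = (ms_between - ms_error) / (ms_between + (k - 1) * ms_error)
    pooled = (ms_between - ms_error) / ms_between
    return single, pooled


# Clinician #1, Factor 10 (values as read from Table 28), k = 3 conditions.
r1, rk = intraclass_reliabilities(24.18, 19, 16.67, 40, 3)          # unadjusted
r1_adj, rk_adj = intraclass_reliabilities(24.18, 19, 13.87, 38, 3)  # adjusted
print(round(r1, 2), round(rk, 2), round(r1_adj, 2), round(rk_adj, 2))
```

On this reading the adjusted estimates exceed the unadjusted ones whenever the treatment mean square is larger than the residual mean square, which is the situation for every clinician on Factor 10.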




TABLE 28

One Way Analysis of Variance with Repeated Measures

Factor 10: Readiness to Learn

Rater   Source of Variation   Sums of Squares   df   F             p             p*

1       Between               24.18             19
        Within                16.67             40
          Treatment           2.80              2    3.84          .03           .06
          Residual            13.87             38
        Total                 40.85             59

2       Between               41.99             23
        Within                26.00             48
          Treatment           4.53              2    4.85          .01           .03
          Residual            21.47             46

3       Between               51.12             29
        Within                59.33             60
          Treatment           11.35             2    6.86          .002          .01
          Residual            47.98             58
        Total                 110.45            89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.



TABLE 29

Unadjusted and Adjusted Reliability Estimates

Factor 10: Readiness to Learn

Clinician   Source   Unadjusted    Adjusted
                     Reliability   Reliability

1           Single   .41 (R1)      .45 (R*1)
1           Pooled   .67 (Rk)      .71 (R*k)
2           Single   .44 (R1)      .49 (R*1)
2           Pooled   .70 (Rk)      .74 (R*k)
3           Single   .21 (R1)      .27 (R*1)
3           Pooled   .44 (Rk)      .53 (R*k)

TABLE 30

Means and Standard Deviations

Factor 10: Readiness to Learn

Clinician   Rating Condition   Mean          Standard Deviation

1           Interview          3.35          .65
1           Test               2.85          .85
1           Combined           2.95          .86
2           Interview          3.83          1.07
2           Test               3.25          .92
2           Combined           3.37          .81
3           Interview          3.57          .99
3           Test               2.93          1.09
3           Combined           2.73          1.06
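
The tables of means and the ANOVA tables can be checked against one another, since the treatment sum of squares depends only on the condition means and the number of candidates. The sketch below applies the standard formula to the Clinician #1 means for Factor 10. The value n = 20 is an assumption inferred from the 19 between-candidate degrees of freedom, and the function name is illustrative.

```python
def treatment_ss(condition_means, n_candidates):
    """Treatment SS for a repeated-measures design from the condition means."""
    grand = sum(condition_means) / len(condition_means)
    return n_candidates * sum((m - grand) ** 2 for m in condition_means)


# Clinician #1, Factor 10: interview, test and combined means.
ss = treatment_ss([3.35, 2.85, 2.95], 20)
print(round(ss, 2))
```

The result, 2.80, equals the Within minus Residual sums of squares for Rater 1 in Table 28 (16.67 less 13.87), so the two tables agree for this clinician.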




Factor 11: Management Level Planning and Problem Solving 

Factor 11 is described as "the individual's ability to recognize
the full depth and breadth of situations and problems and to
consider the longer range, as well as the here-and-now consequences
of their change or resolution" (Appendix 1). In this study,
Factor 11 is appraised by the Watson-Glaser Critical Thinking
Appraisal, Differential Aptitude Tests: Verbal and Abstract,
California Psychological Inventory, Edwards Personal Preference
Schedule, as well as by interview evaluation.

In comparison to the results for Factor 10 just presented,
the results for Factor 11 are encouraging. From Table 31, we see
that none of the mean differences, for any of the three clinicians,
is significant. There is a high degree
of convergence within each clinician by rating condition cluster.

Reliability estimates for Factor 11 are also very respectable,
with values of R*1 approximating .55 for all clinicians. Apparently,
both in terms of mean variation and intra-rating condition
convergence, Factor 11 is regarded similarly by all three clinicians
for all three ratings.




TABLE 31

One Way Analysis of Variance with Repeated Measures

Factor 11: Management Problem Solving

Rater   Source of Variation   Sums of Squares   df   F             p             p*

1       Between               58.27             19
        Within                18.67             40
          Treatment           .23               2    .24           .79           [illegible]
          Residual            18.43             38
        Total                 76.93             59

2       Between               85.33             23
        Within                34.67             48
          Treatment           .58               2    .39           .68           [illegible]
          Residual            34.08             46
        Total                 120.00            71

3       Between               92.90             29
        Within                42.00             60
          Treatment           .20               2    .14           .87           [illegible]
          Residual            41.80             58

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.


TABLE 32

Unadjusted and Adjusted Reliability Estimates

Factor 11: Management Problem Solving

Clinician   Source   Unadjusted    Adjusted
                     Reliability   Reliability

1           Single   .65 (R1)      .64 (R*1)
1           Pooled   .85 (Rk)      .84 (R*k)
2           Single   .58 (R1)      .57 (R*1)
2           Pooled   .81 (Rk)      .80 (R*k)
3           Single   .54 (R1)      .53 (R*1)
3           Pooled   .78 (Rk)      .77 (R*k)

TABLE 33

Means and Standard Deviations

Factor 11: Management Problem Solving

Clinician   Rating Condition   Mean          Standard Deviation

1           Interview          [illegible]   .80
1           Test               [illegible]   1.24
1           Combined           2.40          1.26
2           Interview          2.87          .88
2           Test               3.04          1.40
2           Combined           3.08          [illegible]
3           Interview          [illegible]   [illegible]
3           Test               2.67          [illegible]
3           Combined           2.67          1.38




Factor 12: General Energy Level 

General energy level is "the level of physical vigor and 
vitality the individual will demonstrate in his everyday conduct" 
(Appendix 1). This factor is sampled by the Edwards Personal 
Preference Schedule, California Psychological Inventory, Management 
Aptitude Inventory as well as by interview evaluations. 

Except for Clinician #3, clinicians do not differ in their
mean ratings between rating conditions for Factor 12. The significant
difference between mean scores for Clinician #3, which is summarized
in Table 34, is again indicative of a difference in mean pegging
across rating conditions. A Newman-Keuls multiple comparison
between the results of Clinician #3 indicates that, as was the case
on many other factors, the mean of the interview rating condition
is significantly higher than the mean of the test condition.
Apparently, candidates are rated "more generously" in the interview
condition than they are in the test condition.

Although Clinician #3 differs from his two counterparts in
mean differences between rating conditions, he differs very little
in obtained reliability estimates on Factor 12. All clinicians
obtain R*1 values of approximately .28, indicating considerable
ranking differences between rating conditions. With such a low R*1
value, differential diagnosis is a considerable possibility.




TABLE 34

One Way Analysis of Variance with Repeated Measures

Factor 12: General Energy Level

Rater   Source of Variation   Sums of Squares   df   F             p             p*

1       Between               10.18             19
        Within                10.67             40
          Treatment           .70               2    1.33          [illegible]   [illegible]
          Residual            9.97              38
        Total                 20.85             59

2       Between               23.99             23
        Within                21.33             48
          Treatment           .53               2    [illegible]   [illegible]   .45
          Residual            20.80             46
        Total                 45.32             71

3       Between               23.83             29
        Within                24.67             60
          Treatment           2.60              2    3.42          .04           [illegible]
          Residual            22.07             58
        Total                 48.50             89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.
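
The F ratios in these tables follow directly from the Treatment and Residual rows. The sketch below recomputes F for Rater 1 on Factor 12 from the legible entries of Table 34 (treatment SS = .70, residual SS = 9.97). The conservative degrees of freedom shown are those of the Geisser-Greenhouse lower-bound test, one common way of allowing for unequal covariances among repeated measures. That this is the procedure behind p* is an assumption, since the table note does not name it.

```python
def f_ratio(ss_treatment, df_treatment, ss_residual, df_residual):
    """Treatment mean square divided by residual mean square."""
    return (ss_treatment / df_treatment) / (ss_residual / df_residual)


def conservative_df(n_candidates):
    """Lower-bound (most conservative) df for the repeated-measures F test.

    Assumed here to correspond to p*; the usual df would be
    (k - 1, (k - 1) * (n - 1)) for k rating conditions.
    """
    return 1, n_candidates - 1


# Rater 1, Factor 12: 20 candidates (19 between-candidate df), 3 conditions.
print(round(f_ratio(0.70, 2, 9.97, 38), 2))
print(conservative_df(20))
```

The recomputed F of 1.33 matches the tabled value. Referring the same F to 1 and 19 degrees of freedom rather than 2 and 38 gives a test that can only be less significant, which is consistent with p* never falling below p in these tables.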




TABLE 35

Unadjusted and Adjusted Reliability Estimates

Factor 12: General Energy Level

Clinician   Source   Unadjusted    Adjusted
                     Reliability   Reliability

1           Single   .25 (R1)      .26 (R*1)
1           Pooled   .50 (Rk)      .51 (R*k)
2           Single   .31 (R1)      .30 (R*1)
2           Pooled   .57 (Rk)      .57 (R*k)
3           Single   .25 (R1)      .28 (R*1)
3           Pooled   .50 (Rk)      .54 (R*k)

TABLE 36

Means and Standard Deviations

Factor 12: General Energy Level

Clinician   Rating Condition   Mean          Standard Deviation

1           Interview          3.65          .48
1           Test               3.40          [illegible]
1           Combined           3.60          .66
2           Interview          3.58          .76
2           Test               3.50          .64
2           Combined           3.71          .93
3           Interview          4.07          .68
3           Test               3.67          [illegible]
3           Combined           3.77          [illegible]




Factor 13: Efficiency of Application

Efficiency of application is defined as "the economic and 
productive organization and application of work time and effort" 
(Appendix 1). It is sampled by the Management Aptitude Inventory, 
California Psychological Inventory, Vocational Preference Inventory, 
Test of Practical Judgement as well as by interview evaluations. 

Table 37 summarizes the ANOVA associated with Factor 13 for
all three clinicians. As is evident from the table, there are no
significant mean differences within each clinician by rating condition
cluster. Table 38 summarizes the intra-rater reliability estimates
for Factor 13. For Clinicians #1 and #3, it appears that the absence
of significant mean differences between rating conditions is
complemented by substantial R*1 values approximating .50.
Clinician #2, however, does not match this level of convergence,
obtaining an R*1 value of only .16. This value would make the
reliability of any individual decision, based on any one rating
condition, tenuous. As should be expected, R*k values are in
close correspondence with those obtained for R*1. However, even
the R*k value of .36 for Clinician #2 is of concern for prediction
purposes.
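
The concern about Clinician #2 can be put in numerical terms. The sketch below assumes the Spearman-Brown relation between single and pooled reliability (an inference from the tabled values, not something the text states) and inverts it to ask how many independent rating conditions would be needed to reach a given pooled reliability. The target of .70 is purely illustrative and is not a criterion taken from the study.

```python
def spearman_brown(r_single, k):
    """Reliability of the average of k ratings, given single-rating reliability."""
    return k * r_single / (1 + (k - 1) * r_single)


def conditions_needed(r_single, r_target):
    """Number of ratings whose average would reach r_target (may be fractional)."""
    return r_target * (1 - r_single) / (r_single * (1 - r_target))


# Clinician #2, Factor 13: adjusted single-condition estimate R*1 = .16.
print(round(spearman_brown(0.16, 3), 2))        # pooled over the three conditions
print(round(conditions_needed(0.16, 0.70), 2))  # hypothetical target of .70
```

Stepping R*1 = .16 up to three conditions gives .36, the R*k value cited above. On the same assumption roughly twelve comparable rating conditions would be needed to reach the illustrative .70 level, which conveys how little is gained for this clinician by pooling only three.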




TABLE 37

One Way Analysis of Variance with Repeated Measures

Factor 13: Efficiency of Application

Rater   Source of Variation   Sums of Squares   df   F             p             p*

1       Between               24.40             19
        Within                14.00             40
          Treatment           .40               2    .56           .58           .46
          Residual            13.60             38
        Total                 38.40             59

2       Between               [illegible]       23
        Within                26.67             48
          Treatment           .86               2    [illegible]   .47           .39
          Residual            25.80             46

3       Between               66.99             29
        Within                28.67             60
          Treatment           .69               2    .71           .49           .40
          Residual            27.98             58
        Total                 95.65             89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 38

Unadjusted and Adjusted Reliability Estimates

Factor 13: Efficiency of Application

Clinician   Source   Unadjusted    Adjusted
                     Reliability   Reliability

1           Single   .47 (R1)      .46 (R*1)
1           Pooled   .73 (Rk)      .72 (R*k)
2           Single   .16 (R1)      .16 (R*1)
2           Pooled   .37 (Rk)      .36 (R*k)
3           Single   .56 (R1)      .56 (R*1)
3           Pooled   .79 (Rk)      .79 (R*k)

TABLE 39

Means and Standard Deviations

Factor 13: Efficiency of Application

Clinician   Rating Condition   Mean          Standard Deviation

1           Interview          3.70          .64
1           Test               3.90          .97
1           Combined           3.60          [illegible]
2           Interview          3.17          .80
2           Test               2.92          .86
2           Combined           2.96          .73
3           Interview          3.20          .79
3           Test               3.00          1.18
3           Combined           3.17          1.10




Factor 14: Self-Confidence 

Self-confidence is described as "the degree of basic security 
the individual feels in his own ability to deal adequately with most 
situations and people he encounters" (Appendix 1). This factor is 
sampled by the California Psychological Inventory, Edwards Personal 
Preference Schedule, Management Aptitude Inventory and by interview 
evaluations. 

As summarized in Table 40, significant differences between
rating condition means are evident for Clinicians #2 and #3. A
Newman-Keuls multiple comparison between mean differences reveals
that, for both clinicians, the mean of the interview rating condition
is significantly higher than the mean of the test rating
condition. For both of these clinicians, test results moderate
interview evaluations to yield a combined rating lower than the
interview rating but higher than the test rating. This is not
the case for Clinician #1 where, if anything might be said about
the statistically insignificant differences, it should be that
interview evaluations moderate higher test ratings.

Reliability estimates for all three clinicians range
from barely acceptable (R*1 = .34) to quite credible (R*1 = .55).
Roughly 20% of the error variance vis-a-vis reliability is controlled
by averaging the results from all three procedures rated
independently (R*k).




TABLE 40

One Way Analysis of Variance with Repeated Measures

Factor 14: Self-Confidence

Rater   Source of Variation   Sums of Squares   df   F             p             p*

1       Between               22.98             19
        Within                10.67             40
          Treatment           .90               2    [illegible]   [illegible]   .20
          Residual            9.77              38
        Total                 33.60             59

2       Between               26.00             23
        Within                24.00             48
          Treatment           3.58              2    4.04          .02           .06
          Residual            20.42             46
        Total                 50.00             71

3       Between               35.65             29
        Within                25.33             60
          Treatment           2.82              2    3.63          .03           .07
          Residual            22.51             58
        Total                 60.99             89

p* = Conservative probability of F which makes allowances for unequal
covariances among correlated measures.




TABLE 41

Unadjusted and Adjusted Reliability Estimates

Factor 14: Self-Confidence

Clinician   Source   Unadjusted    Adjusted
                     Reliability   Reliability

1           Single   .54 (R1)      .55 (R*1)
1           Pooled   .78 (Rk)      .79 (R*k)
2           Single   .30 (R1)      .34 (R*1)
2           Pooled   .56 (Rk)      .61 (R*k)
3           Single   .39 (R1)      .42 (R*1)
3           Pooled   .66 (Rk)      .68 (R*k)

TABLE 42

Means and Standard Deviations

Factor 14: Self-Confidence

Clinician   Rating Condition   Mean          Standard Deviation

1           Interview          3.70          .64
1           Test               4.00          .89
1           Combined           3.85          [illegible]
2           Interview          3.75          .83
2           Test               3.21          .76
2           Combined           3.54          .81
3           Interview          4.43          .80
3           Test               4.00          .86
3           Combined           [illegible]   .75




Factor 15: Supervisory Effectiveness 


Supervisory effectiveness refers to "the individual's 
habitual effectiveness in directing, co-ordinating, and controlling 
subordinates in standard work settings" (Appendix 1). This factor 
is appraised by the California Psychological Inventory, Edwards 
Personal Preference Schedule, Management Aptitude Inventory, 
Supervisory Practices Test, Test of Practical Judgement and 
interview evaluations. 

As has been the case for many other factors, there is a high
convergence of mean ratings within each clinician by rating condition
cluster. Candidates appear to be rated on the same yardstick in
each of the three rating conditions.

If, as noted above, candidates are being rated with the same
yardstick in each rating condition, they are not measured
identically in each case. Individual case (R*1) reliability estimates
for Clinicians #1 and #2 are only low to moderate (R*1 = .30 or .40).
Clinician #3, however, is remarkably consistent in his ratings of
supervisory effectiveness between rating conditions (R*1 = .56).
For him, the possibility of differing diagnosis as a function of
assessment condition is reduced.




TABLE 43 


One Way Analysis of Variance with Repeated Measures 


Factor 15: Supervisory Effectiveness 


Rater Source of Variation Sums of Squares df F p p*


SS A 


mOe 


5a) 


en 


Seles 


a8 


2a 


. 80 


1 Between ZB aL8 19 
Within 20507 40 
Treatment 1.20 2 
Residual TOT 77 38 
Total 48.85 59 
2 Between 21578 23 
Within 20)n, 48 
Treatment 1.36 2 
Residual 13230 46 
Total 42.44 FAL 
3 Between 67.83 29 
Within 28.07 60 
Treatment 207 2 
Residual 28.60 58 
Total 96700 89 


p* = Conservative probability of F which makes allowances for unequal 


covariances among correlated measures. 




TABLE 44 


Unadjusted and Adjusted Reliability Estimates 


Factor 15: Supervisory Effectiveness 


Clinician   Source   Unadjusted     Adjusted
                     Reliability    Reliability

1           Single   .38 (R1)       .40 (R*1)
1           Pooled   .65 (Rk)       .67 (R*k)
2           Single   .29 (R1)       .30 (R*1)
2           Pooled   .55 (Rk)       .56 (R*k)
3           Single   .56 (R1)       .56 (R*1)
3           Pooled   .80 (Rk)       .79 (R*k)

TABLE 45 


Means and Standard Deviations 


Factor 15: Supervisory Effectiveness 


Clinician Rating Condition Mean | | Standard Deviation 
4 Interview 3.30 78 
di Test 3.65 1301 
2 Interview 2.87 Be 
2 Test 2nOe 76 
2 Combined WEEMS) od 
3 Interview 2.87 72 
3 Test 2.83 Leah 
3 Combined 2.80 1.08 




Factor 16: Autonomy 

Autonomy is described as "the degree of the individual's 
need to make his own decisions, regulate his own behavior, be his 
own boss, etc." (Appendix 1). This factor is appraised by the 
Edwards Personal Preference Schedule, California Psychological 
Inventory and interview evaluations. 

Table 46 indicates that there is a similarity in mean ratings
between rating conditions for Clinicians #1 and #2. Clinician #3,
however, obtains a very significant F ratio indicating differences
between rating conditions with respect to mean ratings. A Newman-Keuls
multiple comparison between the means of the three rating
conditions indicates that the mean of the interview condition is
significantly greater than the means of the test and combined
conditions. Test results appear to moderate interview evaluations
very considerably when rating in the combined condition.

Although there are significant mean differences between rating
conditions for Clinician #3, we see from Table 47 that, once mean
differences are removed as a source of error, his convergence
estimate (R*1) is considerably higher than that observed for the
other two clinicians. This points up the necessity of considering
both mean differences and intra-rater reliability when discussing
convergence.




TABLE 46 
One Way Analysis of Variance with Repeated Measures 


Factor 16: Autonomy 




Rater Source of Variation Sums of Squares df F p p* 


i Between 22.98 19 
Within 20.00 40 
Treatment 123 2 25 Scie) 28 
Residual LO76 38 
Total 42,98 59 
2°. Between 32565 23 
Within 28% 67 48 
Treatment 2.03 2 Wy AS: mks) 20 
Residual 26.64 46 
Total 614,32 ale 
3 Between 61.29 29 
Within 40.67 60 
Treatment 9.35 2 4.40 02 O4 
Residual 35. 31 58 
Total TOL OG 89 


p* = Conservative probability of F which makes allowances for unequal 


covariances among correlated measures. 




TABLE 47 


Unadjusted and Adjusted Reliability Estimates 


Factor 16: Autonomy 


Clinician   Source   Unadjusted     Adjusted
                     Reliability    Reliability

1           Single   .32 (R1)       .33 (R*1)
1           Pooled   .59 (Rk)       .59 (R*k)
2           Single   .31 (R1)       .33 (R*1)
2           Pooled   .58 (Rk)       .59 (R*k)
3           Single   .41 (R1)       .45 (R*1)
3           Pooled   .68 (Rk)       .71 (R*k)

TABLE 48 


Means and Standard Deviations 


Factor 16: Autonomy 


Clinician Rating Condition Mean Standard Deviation 

it Interview 35935 oN 
Z Test sa2O ake 
a Combined 3.00 .89 
2 Interview 3.08 aoa 
2 Test idk .93 
2 Combined 2.75 : OF, 
3 Interview 3.37 1,05 
4 Test 22383 1.00 

2687 Wey Ons) 


3 Combined 




Factor 17: Responsibility

Factor 17 refers to "the degree to which the individual lives 
up to personal, professional, and business obligations he has tacitly 
or otherwise accepted" (Appendix 1). This is assessed by an inter- 
pretation of the California Psychological Inventory, Management 
Aptitude Inventory, Edwards Personal Preference Schedule and 
interview evaluations. 

Table 49 summarizes the F tests associated with Factor 17.
The results of Clinician #1 indicate a marginally significant
difference between the means of the three assessment conditions.
However, for Clinician #1, a Newman-Keuls multiple comparison
between mean differences indicates that, although the overall F
ratio is significant, no individual difference between mean pairs
is great enough to be considered significant. Clinician #3 also
obtains a significant F indicating significant overall differences
between groups. Further, a Newman-Keuls multiple comparison reveals
that the mean of the interview condition is greater than the mean
of the combined rating condition. On referring to Table 51, it
is surprising to note that the mean of the combined rating condition
is lower than that of either the test or the interview rating condition.

Intra-rater reliability (R*1) estimates are also moderate for
all three clinicians for Factor 17, being in the order of .30 to .40.
Apparently, candidates are rated differently in the three rating
conditions, but differences in rating made in a positive direction
are nearly equalled by differences in rating made in the negative
direction (low F and low R*1).




TABLE 49 


One Way Analysis of Variance with Repeated Measures 


Factor 17: Responsibility


Rater Source of Variation Sums of Squares df F p p*


i Between 162277 19 
Within 16.67 LO 
Treatment 2.43 2 3.20 05 .09 
Residual 14.23 38 
2 Between 19.54 Za 
Within Ig. 33 48 
Treatment | 98 : aos 0 Scuil 
Residual 2S 46 
3 Between S552 29 
Within 37.33 60 
Treatment 6.75 2 6.41 .003 202 
Residual 30.58 98 
Total 72.45 89 




p* = Conservative probability of F which makes allowances for unequal 


covariances among correlated measures. 
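
As a minimal sketch of how the treatment F ratio in a table of this kind is formed, the snippet below divides the treatment mean square by the residual mean square and then evaluates the ordinary (unadjusted) probability p. The function names are hypothetical. The figures are those read from the Clinician #3 rows of Table 49, with the residual df taken as 58 on the assumption of 30 candidates rated under three conditions. The closed-form tail probability used here holds only when the treatment effect has exactly 2 df; the conservative p* is not computed.

```python
def repeated_measures_f(ss_treatment, df_treatment, ss_residual, df_residual):
    """Return the F ratio for the treatment (rating condition) effect."""
    ms_treatment = ss_treatment / df_treatment   # mean square for conditions
    ms_residual = ss_residual / df_residual      # error term once subject and mean effects are removed
    return ms_treatment / ms_residual


def upper_tail_p_two_df(f_ratio, df_residual):
    """Upper-tail probability of F when the numerator has exactly 2 df.

    For F(2, n) the survival function reduces to (1 + 2F/n) ** (-n/2),
    so no statistics library is required.
    """
    return (1.0 + 2.0 * f_ratio / df_residual) ** (-df_residual / 2.0)


# Clinician #3, Factor 17 (values as read from Table 49; df of 58 is assumed)
f_ratio = repeated_measures_f(6.75, 2, 30.58, 58)
p_value = upper_tail_p_two_df(f_ratio, 58)
print(round(f_ratio, 2), round(p_value, 3))
```

Under these assumptions the result is close to the tabled F of 6.41 and p of .003; the small difference in F would be expected from rounding in the printed sums of squares.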




TABLE 50 
Unadjusted and Adjusted Reliability Estimates 


Factor 17: Responsibility 


Clinician   Source   Unadjusted       Adjusted
                     Reliability      Reliability

1           Single   .26 (R1)         .30 (R*1)
1           Pooled   .51 (Rk)         .56 (R*k)
2           Single   .41 (R1)         .41 (R*1)
2           Pooled   .67 (Rk)         .67 (R*k)
3           Single   [illegible] (R1) .30 (R*1)
3           Pooled   .43 (Rk)         .56 (R*k)

TABLE 51


Means and Standard Deviations 


Factor 17: Responsibility 


Clinician Rating Condition Mean Standard Deviation 
1 Interview Sy Ae) 162 
aL Test Becks 5s 
aL Combined 3.30 TS 
2 Interview 3.37 63 
2 Combined 3.33 69 
3 Interview 3.43 1.02 
3 Test Bi (0) 66 
3 Combined 2.77 84 




Factor 18: General Suitability 

Factor 18 is described as a "self explanatory" rating 
(Appendix 1) in that it refers to the overall suitability or overall 
rating of a candidate. It could be likened to the measure of 
general intelligence in intellectual assessments in that a composite 
is presumed. 

Statistics associated with Factor 18 are somewhat alarming. 
Only Clinician #3 obtains a significant F and a further Newman-Keuls 
multiple comparison indicates that the mean of the interview 
assessment condition is significantly greater than the mean of either 
of the other two rating conditions. 

It is the low R*1 values which disclose the most about the
rating of this factor. Values range from a low of .05 to a high
of .33, the lowest seen for any factor. This indicates considerable
intra-individual ranking differences. Candidates are not viewed
as uniform with respect to their overall suitability across rating
conditions.




TABLE 52 
One Way Analysis of Variance with Repeated Measures 


Factor 18: General Suitability 


Rater Source of Variation Sums of Squares df F p p*


1 Between aly ch 19 
Within 26.67 40 
Treatment 2 oaks 2 L265 20, Spal 
Residual 24.53 38 
Total 40.98 59 
See OER ner ger ae ne SN ee ee lg OP ae ae ee 
2 Between 24.61 23 
Within 30.00 48 
Treatment 3.36 Z 2590 70.0 adhe) 
Residual 26.64 46 
ee ei oh Tota gh pw eer So ee ee ea ee ee 
3 Between 2765 29 
Within 26.00 60 
Treatment 3.49 2D 4.49 01 O4 
Residual 22.91 58 
Total 5365 89 


p* = Conservative probability of F which makes allowances for unequal 


covariances among correlated measures. 
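
The sums of squares in a table such as Table 52 can be converted into single (R1, R*1) and pooled (Rk, R*k) reliability estimates. The sketch below is an assumption about the computing formulas, which are not restated in this section: it uses the usual analysis-of-variance intraclass estimates, with the within-subject mean square as the error term for the unadjusted values and the residual mean square (mean differences removed) for the adjusted values. The function name is hypothetical.

```python
def intraclass_reliability(ss_between, df_between, ss_error, df_error, k):
    """Return (single, pooled) reliability from ANOVA sums of squares.

    ss_between / df_between : between-candidates sum of squares and df
    ss_error / df_error     : within-candidates term (unadjusted) or
                              residual term (adjusted for mean differences)
    k                       : number of rating conditions being compared
    """
    ms_between = ss_between / df_between
    ms_error = ss_error / df_error
    single = (ms_between - ms_error) / (ms_between + (k - 1) * ms_error)
    pooled = (ms_between - ms_error) / ms_between
    return single, pooled


# Clinician #2, Factor 18 (values as read from Table 52)
r1, rk = intraclass_reliability(24.61, 23, 30.00, 48, k=3)          # unadjusted
r1_adj, rk_adj = intraclass_reliability(24.61, 23, 26.64, 46, k=3)  # adjusted
print(round(r1, 2), round(rk, 2), round(r1_adj, 2), round(rk_adj, 2))
```

With these inputs the sketch reproduces the clearly legible Clinician #2 entries of Table 53 (.42 for Rk, .22 for R*1, and .46 for R*k), which suggests, but does not prove, that the formulas assumed here match those used in the study. The pooled value is also the Spearman-Brown step-up of the single value for k = 3.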




TABLE 53 
Unadjusted and Adjusted Reliability Estimates 


Factor 18: General Suitability 


Clinician   Source   Unadjusted     Adjusted
                     Reliability    Reliability

1           Single   .04 (R1)       .05 (R*1)
1           Pooled   .12 (Rk)       .14 (R*k)
2           Single   .19 (R1)       .22 (R*1)
2           Pooled   .42 (Rk)       .46 (R*k)
3           Single   .29 (R1)       .33 (R*1)
3           Pooled   .55 (Rk)       .59 (R*k)

TABLE 54 


Means and Standard Deviations 


Factor 18: General Suitability 


Clinician Rating Condition Mean Standard Deviation 
uy Interview 3.45 74 
My Test BOGS Lae2 
if Combined 3205 ek!) 
2 Interview SELF 262 
2 Test Pat fe 89 
2 Combined 7g al .98 
3 Interview 3.40 .66 
3 Test 3.00 82 
3 Combined 2.97 15 




Inter-Rater Reliability: Test Condition 

As an added precaution against the possibility of clinicians
remembering test profiles already used in the combined rating
condition when they were rating in the test condition, all clinicians
rated all 74 test profiles (their own plus those of the other two
clinicians). As noted in Chapter 3, each profile was rated
individually, in random sequential order, without identifying
demographic information. A side effect of using this blind rating
approach is that it is possible to see how closely the three
clinicians involved in this study rated the same profiles; i.e., it
is possible to obtain a measure of the inter-rater reliability
(consensus) of test condition assessment decisions as well as the
intra-rater reliability estimates already presented. Inter-rater
reliability is important because it can give us some idea of the
consistency of three clinicians in rating similar profiles under
similar situations. If inter-rater reliability is low, the problem
of good predictions is further complicated. Not only would
differences in rating situations be important in so far as the actual
rating is concerned, but the rating made would also be extremely
clinician-dependent. Although in this study, and in most real life
assessment situations, a candidate is usually rated by only one
clinician, it is interesting to note how much of the rating given
is "clinician-dependent" and how much is "clinician-independent"
with respect to assigned value. This says nothing about validity,
however, since high consistency does not necessarily lead to
prediction accuracy.

The inter-rater reliability indices of the three clinicians'
ratings done in the test rating condition for all 74 subjects on
17 factors are summarized in Table 55. Factor 3 is not presented
since it will be recalled that Clinician #3 did not rate oral
communication in the test condition mode.

It is seen that the R*1 inter-rater reliability estimates
vary from a low of .19 to a high of .89 with a mean value of .62.
With the exception of the R*1 value of .19 for Factor 12, all
reliability estimates are ≥ .50. Factor 12, as was evident from
Tables 34-36, was a factor with which all clinicians had difficulty
in cross-group ratings; intra-rater reliability indices were also
very low. It seems that, even with a single category of information,
clinicians differ in their interpretation of "general energy
level" and/or how it is measured via psychometric profiles.




TABLE 55 


Unadjusted and Adjusted Inter-Rater Reliability Estimates 


Test Rating Condition 


Factor Unadjusted Reliability (R1) Adjusted Reliability (R*1) 
AL Aer . 66 
2 Pa a . 80 
4 ately) 09 
5 48 49 
6 Aloyy/ 369 
i .66 209 
8 65 Bole) 
9 SONS Ane) 

10 Ow yee) 
ry 287 Sieiel 
12 male. pee 
13 Boke Ao o 
14 al £50 
10S) 200 Bole, 
16 eile a8 
ay) Bole Aoi 8) 
18 04 Arche: 




Factor Analysis of Test Condition Ratings 

A factor analysis of each clinician's ratings on the 18 factors 
for all candidates (N = 74) rated in the test condition was under- 
taken. This procedure was deemed useful to assist in a discussion 
of the results just presented. It was thought useful to examine 
clusters of similar factor ratings made between candidates to 
establish possible communalities of ratings. It seems likely that, 
although factors have been presented as semantically and construc- 
tually idiosyncratic (Appendix 1), there are common ratings made 
on an individual between factors, i.e., ratings may be mutually 
interdependent.

With this in mind, a principal axis factoring with varimax
orthogonal rotation was attempted with the results obtained from
each clinician's rating of the 74 candidates on the 18 factors in
the test condition. Since this factor analysis is not central to
this results section, findings are detailed in Appendix 6 and are
presented here in summary form only.

Using an eigenvalue cut-off value of 1.00, each clinician's
original ratings based on 18 factors were found to load significantly
on five major factors. The percentage of total variance of the
original 18 factors accounted for by the five new factors ranges
from 71% for Clinician #1 to 60% for Clinician #3.
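
A minimal sketch of the retention step, assuming the rule applied is the common criterion of keeping factors whose eigenvalue is at least 1.00. The eigenvalues below are hypothetical and are not taken from Appendix 6; for 18 standardized rating factors they sum to 18, so each eigenvalue divided by 18 gives the share of total variance for that factor. The sketch works from unrotated eigenvalues; after a varimax rotation the share for each factor would instead be computed from the rotated loadings.

```python
def retained_factors(eigenvalues, cutoff=1.0):
    """Keep factors at or above the cutoff and report their percentage of variance."""
    n_variables = len(eigenvalues)  # one eigenvalue per original variable
    kept = [ev for ev in eigenvalues if ev >= cutoff]
    percents = [100.0 * ev / n_variables for ev in kept]
    return kept, percents


# Hypothetical eigenvalues for 18 rating factors (they sum to 18)
eigenvalues = [4.0, 3.2, 2.5, 1.6, 1.3, 0.9, 0.8, 0.7, 0.6,
               0.5, 0.4, 0.4, 0.3, 0.3, 0.2, 0.1, 0.1, 0.1]

kept, percents = retained_factors(eigenvalues)
print(len(kept), [round(p) for p in percents], round(sum(percents)))
```

In this illustration five factors pass the cutoff and together account for about 70% of the total variance, the same order of magnitude as the 60% to 71% reported above.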

The results of the factor analysis are presented individually
by clinician. A tentative descriptive title for each of the five
prime factors for each of the three clinicians is typed in brackets
immediately following the factors which appear to load significantly
on that factor.

Factor Loadings

Clinician #1

FACTOR I: General intelligence + adaptability + potential for growth +
readiness to learn + management level planning and problem solving.
Total variance accounted for = 22%. (INTELLECTUAL POTENTIAL).

FACTOR II: Oral communication + leadership force + interpersonal
effectiveness + self-confidence + supervisory effectiveness. Variance
accounted for = 18%. (INTERPERSONAL FORCEFULNESS).

FACTOR III: Efficiency of application + responsibility + general
suitability. Variance accounted for = 17%. (RESPONSIBLE EFFICIENT
WORK STYLE).

FACTOR IV: Self-starting work drive + general energy level.
Variance accounted for = 9%. (WORK DRIVE).

FACTOR V: Common sense + self-reliance + autonomy. Variance
accounted for = 9%. (RESOURCEFULNESS).

Total variance accounted for by factors I - V = 71%.

Clinician #2

FACTOR I: General intelligence + oral communication + potential
for growth + readiness to learn + management level planning and
problem solving + adaptability. Variance accounted for = 22%.
(INTELLECTUAL POTENTIAL).

FACTOR II: General energy level + responsibility + general suitability.
Variance accounted for = 15%. (DIRECTED ENERGY).




FACTOR III: Self-reliance + self-confidence + autonomy. Variance
accounted for = 13%. (RESOURCEFULNESS).

FACTOR IV: Common sense + interpersonal effectiveness + supervisory
effectiveness. Variance accounted for = 10%. (INTERPERSONAL
FORCEFULNESS).

FACTOR V: Self-starting work drive + efficiency of application.
Variance accounted for = 8%. (GOAL DIRECTED WORK DRIVE).

Total variance accounted for by factors I - V = 68%.

Clinician #3

FACTOR I: Leadership force + general energy level + self-confidence +
supervisory effectiveness. Variance accounted for = 14%. (DYNAMIC
LEADERSHIP).

FACTOR II: Adaptability + potential for growth + readiness to
learn + management level planning and problem solving. Variance
accounted for = 14%. (POTENTIAL ABILITY).

FACTOR III: Self-reliance + efficiency of application + responsibility
+ general suitability. Variance accounted for = 13%.
(RESOURCEFULNESS).

FACTOR IV: General intelligence + common sense. Variance accounted
for = 9%. (PRACTICAL PROBLEM SOLVING).

FACTOR V: Self-starting work drive + autonomy. Variance accounted
for = 9%. (INDEPENDENT WORK STYLE).

Total variance accounted for = 60%.




Summary 
Clinician #1 
Factors with significant differences between means: 4, 10, 17 
Mean R*1 value for all 18 factors = .41; standard deviation = .14


PaGbors With R 1 = Os- 9938018, 2 ,°179 we 


actors With RVIS=Pi815-Bieo 21 Gent sem ts, Cue yeaLoItrsetiin, 159 c16 


Factors with R*1 (6lese 00x 9SS07 
Clinician #2 
Factors with significant differences between means: 2, 5, 7, 10, 14 


Mean R*1 value for all 18 factors = .37; standard deviation = .15


Factors with R*l 


Ov= . 30:41, noee7 plate seriel tie 
PaerOrsawi there =8. 31 Seer 2554 25. Net SRRLO aster Ge a. 


Factors with R*1 262 =11700" 8 


tt 


Clinician #3 

Factors with significant differences between means: 4, 85 9, 10, 11, 
Voemmiderey 17 as 

Mean R*1 value for 17 factors (excepting #3) = .38; standard deviation = .12

Paetocennith Bells OM. arenes a5 7 opeloer12, 127 


P3115 senerl  oemuenee leering ie iu iiewsic. 1s 


Factors with R*1 




CHAPTER V 
DISCUSSION 

In this chapter, results pertaining to each of the clinician 
by factor by rating condition interactions will be discussed. 
Common themes will be examined by clinician and factor, an attempt 
will be made to explain significant results, the utility of 
convergence as a psychological construct will be examined, and 
suggestions for further research will be detailed. 

Within the context of this study, convergence is probably
best viewed as a condition affected by both mean differences and
reliability estimates. It is possible to err with the ratings made,
both from the point of view of the actual rating assigned to a
candidate within any rating condition, and from the perspective
of differences in rating of a candidate made across conditions.
The first difference, which is often described as mean pegging error,
can be considered to be a constant. Errors of this type would
result in comparison errors when inflated scores from one group
are compared to deflated scores from another. This type of error
is unlikely to result in errors when considering an individual
within any group since rankings are not changed (each person is
being measured with the same, albeit incorrect, yardstick). Mean
pegging errors are also very easy to correct since changing all raw
rating scores from all groups to standard scores will standardize
between groups.
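
The standard score correction described above can be sketched as follows. The ratings are hypothetical: the same five candidates are rated one point higher in one condition than in the other, which is a pure mean pegging error. The function name is illustrative only, and the population standard deviation is used as a simplifying assumption.

```python
from statistics import mean, pstdev


def to_standard_scores(ratings):
    """Convert raw ratings from one rating condition to standard (z) scores."""
    centre = mean(ratings)
    spread = pstdev(ratings)  # population standard deviation of this condition
    if spread == 0:
        raise ValueError("all ratings are identical; standard scores are undefined")
    return [(r - centre) / spread for r in ratings]


# Hypothetical ratings of the same five candidates under two conditions
interview = [4, 5, 3, 4, 5]   # pegged one point higher
test = [3, 4, 2, 3, 4]

z_interview = to_standard_scores(interview)
z_test = to_standard_scores(test)
print([round(z, 2) for z in z_interview])
print([round(z, 2) for z in z_test])
```

Because the two sets of ratings differ only by a constant, their standard scores coincide and the rank order of candidates is unchanged. Disagreement in ranking between conditions, the kind reflected in a low R*1, would remain after such a transformation.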


From a consideration of the tables in Chapter 4, it is evident
that, even when the difference between rating condition means is
statistically highly significant (e.g., Clinician #3 on Factor 8),
the actual numerical differences between the significantly different
mean-pairs are not great. Thus, from observing the tables in
Chapter 4 once again, we see that most of the raw score differences
between mean-pairs that are significantly different are in the order
of .50 - .80. Since actual ratings made on candidates are in whole
numbers within the range 1 - 5, it is unlikely that differences
across rating conditions for any candidate would exceed one. Thus,
significant differences between the means of the ratings made in
each category may not reflect practical differences between ratings
from the perspective of actual judgements made about that candidate.
What will be said differently about a candidate who scores 4 on a
factor versus that said about a candidate who scores 5?

Low R*1 estimates are a reflection of low concurrence in
subject ratings across rating conditions once mean error has been
removed. Low R*1 values should be viewed more seriously than high
F values since they cannot be eliminated by anything as simple as
a standard score transformation.
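
The point that a standard score transformation cannot repair low concurrence can be illustrated with a short sketch. The assumptions are loud ones: the ratings below are invented, and an ordinary Pearson correlation (the hypothetical helper `pearson_r`) stands in for the R*1 index, which is computed differently in this study. The sketch shows only that standardizing each condition removes the difference in means while leaving the agreement between the two sets of ratings exactly where it was.

```python
from statistics import mean, pstdev


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of ratings."""
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (pstdev(x) * pstdev(y))


def to_standard_scores(raw):
    """Convert one condition's raw ratings to standard (z) scores."""
    m, s = mean(raw), pstdev(raw)
    return [(v - m) / s for v in raw]


# Hypothetical ratings of the same five candidates in two conditions.
test_ratings = [1, 2, 3, 4, 5]
interview_ratings = [4, 2, 5, 3, 5]

r_raw = pearson_r(test_ratings, interview_ratings)
r_std = pearson_r(to_standard_scores(test_ratings),
                  to_standard_scores(interview_ratings))

print(round(r_raw, 3), round(r_std, 3))  # the two values are identical
```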

Low R*1 values may be thought of as reflecting either or both
of two possibilities: (1) basic clinician decision error of the
type noted in the equation, TRUE SCORE = OBTAINED SCORE + ERROR or,
(2) real differences inherent in the information available about
a candidate as a result of sampling in any of the three appraisal
conditions which would cause even a totally accurate clinician to
diagnose differentially dependent on assessment condition (i.e.,
real differences in the quantitative information available about a
candidate). The first possibility is that referred to by
researchers such as Little & Shneidman (1959) or Goldberg & Werts
(1966). The second has been ignored in the literature.

It is this second possibility that is most frustrating for
the researcher, and so face-saving for the clinician! It may be
that differences in intra-rater reliability reflect, not only
clinician error, but also differences in the ability
assessed in each condition. This could also be thought of as a
construct difference between factors which bear the same name in
each of the three conditions. It may never be possible to separate
these two types of "error", but it is wise to keep them in mind,
particularly when discussing intra-rater reliability.

It seems logical to presume that, when all three clinicians
obtain high R*1 values on the same factor, both types of error
would be minimized. Similarly, when one or two clinicians obtain
a high R*1 value on any factor, it is tenable to assume that the
lower R*1 value of the other clinician(s) on the same factor
reflects judgement errors (type 1) rather than real differences in
the level of ability assessed by different methods (type 2).

One would assume that the second type of error would be a constant
between and within any given clinician by factor cluster, with R*1
scores which are lower than the highest R*1 value obtained by any
of the three clinicians being due to clinician decision error.


It is also logical to assume that, when R*1 values are low for
all clinicians, or when there is considerable variability between
the R*1 values of each clinician on the same factor, both types of
error are greater, although the relationship between the magnitudes
of the two types of error is indeterminate. It should be noted that
inter-rater reliability estimates (Table 55) include only error
of the first type (judgement error), since the same sources of
information were available to all clinicians. This would be the
essential difference in the interpretation of intra- versus
inter-rater reliability estimates. With these different types of
error in mind, let us examine intra- and inter-rater reliability
in the present study.

Let us assume for the present that an acceptable level of
intra-rater reliability would be approximately .50. In actual
fact, the choice of any criterion value is always arbitrary,
representing a compromise between practical limitations and
statistical desirability. With an R*1 value equal to approximately .50,
we would assume that roughly 50% of the variance of any single
estimate of any factor represented true variance, with the remaining
50% being due to error of various types. Although the choice of
.50 as a criterion value may appear somewhat lenient, it is
realistic given the differences between statistical and practical
significance vis-a-vis score assignment differences previously
discussed.


If one examines each clinician's R*1 estimates across all
18 factors, we see that there are 12 factors where at least one
clinician obtains an R*1 approximately equal to .50. These factors
are Factor 4 (self-starting work drive), Factor 6 (leadership
force), Factor 7 (self-reliance), Factor 8 (adaptability), Factor 9
(potential for growth), Factor 11 (management level planning and
problem solving ability), Factor 13 (efficiency of application),
Factor 14 (self-confidence), Factor 15 (supervisory effectiveness),
Factor 1 (intelligence), Factor 2 (common sense), and Factor 10
(readiness to learn). In several cases, two or even three clinicians
obtain these criterion R*1 values for the factor noted. Factors
where no clinician achieves a criterion R*1 value are Factor 3
(oral communication), Factor 5 (interpersonal effectiveness),
Factor 12 (general energy level), Factor 16 (autonomy), Factor 17
(responsibility), and Factor 18 (general suitability). For these
factors, both types of error would be considerable.

As noted earlier, it is tenable to consider that, for the
factors where one or more clinicians obtain an R*1 value
approximating .50, the difference between this value and the R*1 value
obtained by the other clinician(s) on the same factor is comprised
primarily of clinician judgement error (type 1) rather than essential
differences in the levels of ability measured (type 2). Each
clinician is availed the same types of information about each of
the candidates to be appraised as is every other clinician.
Therefore, errors of the second type would be presumed to be a
constant for all clinicians; possibly large, but still a constant.




If one clinician is able to obtain an R*1 value at a criterion level,
it seems likely that the other clinicians could have also obtained
that level save for their additional degree of clinician error.
It should be recognized that, even for a clinician who obtains a
criterion R*1 value, his ratings still consist of some portion of
both types of error.

Another indication of the contribution of the two types of
error to the convergence indices is in the relationship between
R*1 values across clinicians in one rating condition (inter-rater
reliability; test condition). As noted earlier, inter-rater
reliability estimates suffer only from the first type of error
whereas intra-rater estimates include both types. If the R*1
inter-rater value is high, yet all of the R*1 intra-rater values
are low, one would presume a fair measure of the second type of
error (trait difference) is present.

In this regard, we see that Clinician #1 achieves a criterion
R*1 value on Factors 1, 4, 6, 7, 9, 11, and 14, or on 7 out of the 12
factors on which any clinician obtained a criterion R*1 value.
Clinician #2 obtains a criterion R*1 value on Factors 4, 6, 8, 9,
10, 11, or on 6 out of the 12 factors. Clinician #3 obtains a
criterion R*1 estimate on Factors 2, 6, 11, 13, 15, or on 5 out
of the 12 overall factors. Mean R*1 estimates differ only slightly
between clinicians: .41, .37, .38.


Factors where one or more clinicians do not achieve a criterion
R*1 value, but where at least one clinician does, are, for Clinician #1:
Factors 2, 8, 10, 13, 15; for Clinician #2: Factors 1, 2, 7, 13, 
14, 15; and for Clinician #3: Factors 1, 4, 7, 8, 9, 10, 14. 
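
The bookkeeping behind these lists can be checked with a small sketch. It is offered only as an illustration: the variable names are hypothetical, and the factor numbers are taken from the lists given in the text (the 12 factors on which at least one clinician reached criterion, and each clinician's below-criterion factors). Taking set differences recovers the factors on which each clinician did reach criterion, and the factors on which all three did.

```python
# The 12 factors on which at least one clinician reached criterion.
criterion_factors = {1, 2, 4, 6, 7, 8, 9, 10, 11, 13, 14, 15}

# Below-criterion factors for each clinician, as listed in the text.
below = {
    1: {2, 8, 10, 13, 15},
    2: {1, 2, 7, 13, 14, 15},
    3: {1, 4, 7, 8, 9, 10, 14},
}

# Factors on which each clinician reached criterion (set difference).
achieved = {c: sorted(criterion_factors - b) for c, b in below.items()}
counts = {c: len(f) for c, f in achieved.items()}

# Factors on which all three clinicians reached criterion.
reached_by_all = criterion_factors - set().union(*below.values())

for c in sorted(achieved):
    print(f"Clinician #{c}: {achieved[c]} ({counts[c]} of 12)")
print("Reached by all three:", sorted(reached_by_all))
```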

Let us examine some areas where low R*1 estimates were
obtained. From what has already been said, we see that interpretive
problems were of two main types. For individual clinicians, these
would be factors on which one or more clinicians did not achieve
a criterion R*1 value even when a criterion R*1 estimate was
obtained by at least one other clinician on that same factor.
Interpretive problems for all clinicians collectively would be
factors on which no clinician achieved a criterion R*1 estimate.

Interpretive Problems: All Clinicians Collectively 

It was previously noted that R*1 estimates were below
criterion for all three clinicians on Factors 3 (oral communication),
5 (interpersonal effectiveness), 16 (autonomy), 17 (responsibility),
and 18 (overall rating). With the exception of Factors 17 and 18,
all of the factors noted above are of the interpersonal, oral
persuasiveness type (Appendix 1). It may be that these
interpersonally oriented factors are only poorly or differentially appraised
by psychometric and/or interview means, a possibility raised by
Hendricks (1969). It may also be tenable that, since several
tests or subtests are integrated by the clinician in rating any
single factor, differences across test evaluations on the same
candidate are of concern, a possibility which would explain Little
& Shneidman's (1959) study.


The error inherent in the low R*1 estimates for all
clinicians on these factors might be thought of as reflecting more
appraisal (type 2) error rather than clinician (type 1) error. Real
differences in the levels of ability may have been present which
would require that a "perfect" clinician obtain a low R*1 value in
order to reflect this actual difference.

Factor 17 (responsibility) and Factor 18 (overall estimate) 
are the two other factors on which no clinician obtained a criterion 
R*1 value. The problems in interpreting these two factors are 
similar. What is being measured varies from condition to condition, 
or, in the case of Factor 18, within conditions. In the interview 
condition, it is logical to assume judgements of responsibility 
were based on past performance and quite possibly interpersonal 
persuasiveness. In the psychometric condition, personnel tests, 
which largely measure personality characteristics, were used. 

The differences between what is seen (interview) and what is seen 
to be seen (test) could account for this difference. 

With Factor 18, this problem is complicated since clinicians 
noted that they had difficulty in separating their evaluation in 
terms of suitability for a particular job versus their evaluation of 
suitability in terms of all candidates seen. This difficulty is 
reflected in the low R*1 values for all clinicians, particularly 
Clinician #1. 

Interpretive Problems: Individual Clinicians 
Clinician #1 


It was previously noted that Clinician #1 obtained R*1 estimates
below criterion on Factors 2 (common sense), 8 (adaptability),
10 (readiness to learn), 13 (efficiency of application), and 15
(supervisory effectiveness) even though at least one other clinician
obtained criterion R*1 estimates on these factors. Since at least
one clinician does obtain a criterion R*1 value, it seems likely
that we are dealing with increased clinician error (type 1) on
these factors. On examining these interpretively difficult factors
in the light of the factor analysis already described, it is reassuring
to note that they are spread over 4 of the 5 prime factors. For
purposes of interpretation and evaluation then, this would appear
better than if these were clustered within one prime factor rendering
this factor tenuous for prediction purposes.

Once again, an examination of the raw data reveals that seldom
is a candidate ranked differently by more than one point between rating
conditions. It may be that the measure R*1 is too sensitive given
the meaning and purpose of the ratings.

These factors all have a common description (Appendix 1)
in that they are concerned with applied, concrete operations which
may be difficult to assess in an interview setting; i.e., prediction
of on-the-job applied skills.

Clinician #2 

As indicated earlier, Clinician #2 obtained below-criterion
scores on Factors 1 (intelligence), 7 (self-reliance), 13 (efficiency
of application), 14 (self-confidence), and 15 (supervisory
effectiveness) even though one or more clinicians reached
criterion on these factors. Type 1 error (clinician error) would be
presumed to be higher on these factors than it would be on those
factors where a criterion R*1 value was obtained.

As was the case with Clinician #1, those factors on which a
low R*1 value was obtained are spread over most of the five prime
factors isolated by factor analysis rather than being clustered
wholly within one prime factor. Factor III (resourcefulness),
however, does contain two of these low R*1 factors, which might
render cross-condition predictions rather tenuous. From an
observation of the raw data, it is also apparent that seldom does an
actual ranking difference between rating conditions exceed one for
any of these factors. This further reduces the risk of actual
differences in behavioral predictions based on numerically assigned
differences between rating conditions. On further examining these
factors in the light of the definitions given in Appendix 1, it is
evident that they fall into two general areas: intellectual ability
and independent self-directed work style.

Clinician #3 

Clinician #3 obtained below criterion R*1 values on Factors 1
(intelligence), 4 (self-starting work drive), 7 (self-reliance),
8 (adaptability), 9 (potential for growth), 10 (readiness to learn),
and 14 (self-confidence). As was the case with the other two
clinicians, the low R*1 factors are spread across all of the five
major factors isolated by factor analysis, with the exception of
Factor II (potential ability) which includes three of them.




Commonalities 


When looked at individually, it seems that clinicians had the
most difficulty in convergently rating factors concerned with
inner-directedness, work style and application, future potential
from the perspective of learning and application, and applied
intellectual problem solving.

Conclusions and Indications 

1. Convergence across rating conditions is a far more elusive
standard than is convergence within rating conditions. A comparison
of inter-rater reliability indices with any of the intra-rater
indices shows this very clearly. Logically, this is so because
of the two different types of error discussed at several points.

2. With no exceptions, the most reliable indicator for
purposes of analysis or prediction is the simple arithmetic mean
of the three independent ratings for any given factor. In most
cases, this raises the reliability index by .20 or .30.

3. On looking at the similarity between ratings across
conditions (F and R*1), one can seriously question the value of the
interview technique as an evaluation tool. Combined ratings, which
are the ones actually used for prediction in the organization, most
closely resemble those of the test ratings. Interviews are
expensive and seem to contribute only inconsistency; this being
aside from their obvious public relations function! This is in line
with Webster's (1964) findings.


4. Differences in clinical decisions made by individual
clinicians vis-a-vis prediction cannot be discounted. It is necessary
to look at, not only how well a candidate is predicted to perform,
but who is making that prediction as well. If one combines the
best ratings of all clinicians, the power of our "super clinician"
is tremendous. If one combines the worst ratings . . .

5. Reliability, as it has come to be referred to in the 
literature, is not an adequate construct to use in comparing 
ratings across conditions. Even though we know that we have two 
distinct sources of error, we act as if we have only one by 
clinging to a traditional conceptualization. 

6. Although clinicians tend to look at the same general 
prime areas for purposes of personnel evaluation (factor analytic 
interpretation), the differences are in the weighting of these 
factors for decision making. 

7. Differences in mean ratings across conditions (high F) 
do not necessarily lead to differences in reliability (R*1) or 
vice versa. The consideration of either aspect singly is folly. 

8. Factors of an interpersonal oral persuasiveness nature 
tend to be differentially rated by all clinicians. 

9. As noted by Bray & Grant (1966), all factors are utilized 
for decision making but some contribute a great deal more in terms 
of weighting. 

10. Most of the key characteristics can be evaluated by
interview (Grant & Bray, 1969), but that evaluation is often diffuse
and differential vis-a-vis more "objective" criteria.




11. Even though clinicians differ widely in the amount of
experience they bring to this study, there is little apparent 
difference in level or style of decision making. This is in accord 
with Goldberg (1970) and Stricker (1967). 

12. Sawyer (1966) may be correct in describing the main use 
of the interview as in providing additional, non-psychometric 
information to the evaluation process. However, when that information 
is processed psychometrically, it does not concur with other ratings 
of similar abilities. 

13. In all cases where mean differences between rating 
conditions were noted, test results appeared to moderate interview 
impressions when rating in the combined condition. 

14. Perhaps Holt (1970) was most correct of all when he
said "...clinical psychologists vary considerably in their ability
to do the job, but the best of them can do very well" (p. 348).

Suggestions for Further Research 

1. The most important suggestion concerns predictive
validity. Even though we now know a great deal regarding the
reliability (convergence) of clinical judgement, what is the
predictive validity of judgements made in a cross-condition rating?
Which rating condition best predicts future on-the-job behaviors?
This would, of necessity, be a longitudinal study since only
approximately 20% of the subjects involved in this study actually
became employees of the companies for which they were appraised.


Because of the difficulties of comparing supervisor ratings
(criterion) with clinicians' ratings of future performance, it would
be hard to maintain the same degree of ecological or external
validity attained in this study.

2. It would be interesting and worthwhile, if any predictive 
validity study were undertaken, to ascertain the factor or factors 
(of either the 18 by 18 matrix or the 5 by 5 matrix) which either 
singly or in linear combination would best predict future performance. 
This study would be plagued by the same external validity problems
as suggestion 1 above, but it would be very important research.
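
The comparison proposed in suggestion 2 could be set up along the following lines. This is only a minimal sketch in Python: the factor names, the scores and the criterion values are hypothetical (they are not data from this study), and a unit-weighted sum stands in for whatever linear combination a real validity study would estimate.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def unit_weighted_composite(scores):
    """Sum the factor scores of each candidate (every factor weighted 1)."""
    return [sum(values) for values in zip(*scores.values())]


# Hypothetical factor scores for six candidates (assumption, not study data).
factor_scores = {
    "factor_a": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "factor_b": [2.0, 1.0, 4.0, 3.0, 6.0, 5.0],
}
# Hypothetical later criterion, e.g. a supervisor rating (assumption).
criterion = [1.5, 1.5, 3.5, 3.5, 5.5, 5.5]

# Validity of each factor singly, then of the linear combination.
validities = {name: pearson_r(vals, criterion)
              for name, vals in factor_scores.items()}
validities["composite"] = pearson_r(
    unit_weighted_composite(factor_scores), criterion)

best_predictor = max(validities, key=validities.get)
```

In this constructed example the composite correlates more highly with the criterion than either factor alone. Whether that would hold for the factors of the 18 by 18 or 5 by 5 matrix is exactly the empirical question raised above.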

3. Although there are small differences between the results 
from each clinician, clinician sample size is too small to make 
any generalizable conclusions. A larger clinician sample might 
address itself to problems of clinical judgement, especially areas 
such as amount of professional training and amount of experience, 
problems raised by Goldberg (1970), Stricker (1967), Borke & Fiske 
(1957) and Oskamp (1965). It may be that as much would be lost
as would be gained by this type of procedure with respect to external
validity. If a large number of clinicians were used, many clinicians
would be called on, for experimental convenience, to do tasks that
they do not normally do.

4. Judgement simulation, while possible with the present
data, was not a central aspect of this treatise. What is the
possibility of linearly or exponentially combining single decisions
in order to predict other decisions? Need we have a clinician at
all, or would we be better off simulating a clinician when he is at
his best or most consistent? Holt (1970) addresses these problems
but not in any rigorous experimental sense.
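
The simplest form of such a simulation can be sketched as follows. This is a hedged illustration in Python, not a procedure used in this study: the ratings are hypothetical, a single cue is used where a real simulation would use several (requiring multiple regression), and the names `fit_line` and `simulate_judge` are introduced here purely for illustration.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical ratings by one clinician on the 1-5 scale (assumption).
cue_ratings = [1, 2, 3, 4, 5]        # rating on a single characteristic
overall_ratings = [1, 2, 2, 4, 5]    # the same clinician's overall rating

intercept, slope = fit_line(cue_ratings, overall_ratings)


def simulate_judge(cue):
    """Predict the overall rating from one cue, kept within the 1-5 scale."""
    predicted = intercept + slope * cue
    return min(5, max(1, round(predicted)))
```

The fitted line smooths over the clinician's one inconsistent rating (a 2 given where the linear trend implies 3). In that limited sense the model is the clinician "at his most consistent", which is the comparison the question above asks about.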

5. How might psychological trainees be trained to simulate
or duplicate the decisions of our three "experts"? Would such a
procedure be viable or desirable? Stricker (1967) would see this
as feasible, but what would be lost by such an approach?

6. The two threats to reliability noted in this study merit 
examination from conceptual and practical perspectives. 

7. Replication of this study (or aspects of it) on a non- 
industrial clientele would contribute greatly in the area of 
generalizability. 

8. What is the generalizability of the factor analytic 
combination of the 18 by 18 matrix? What is the effect of feedback 
about your own decisions on future decisions? 


9. What is the effect, in terms of actual behavioral
prediction, of ratings differing only by one point? Are our statistics
too powerful for our procedures?

10. What is the cost effectiveness or utility of the various
approaches? What is gained by the three approaches and is it
worth the price?




REFERENCES 

Albrecht, P. A., Glaser, E. M., & Marks, J. Validation of a multiple 
assessment procedure for managerial personnel. Journal of Applied 
Psychology, 1964, 48, 351-360. 

Ash, P., & Kroeker, L. P. Personnel selection, classification, and 
placement. In M. Rosenzweig & L. W. Porter (Eds.), Annual review of 
psychology. Palo Alto, California: Annual Reviews Inc., 1975,
481-508. 

Baskett, G. B. Interview decisions as determined by competency and 
attitude similarity. Journal of Applied Psychology, 1973, 57,
343-345. 

Bieri, J., Atkins, A., Briar, S., Leaman, R., Miller, H., & Tripodi, T.
Clinical and social judgement: the discrimination of behavioral 
information. New York: Wiley, 1966. 

Blumenfeld, W. S. Early identification of managerial potential by means 
of assessment centers. Atlanta Economic Review, 1971, 21, 35-38. 

Borke, H., & Fiske, D. W. Factors influencing the prediction of 
behavior from a diagnostic interview. Journal of Consulting 
Psychology, 1957, 21, 78-80.

Bray, D. W., & Grant, D. L. The assessment centre in the measurement
of potential for business management. Psychological Monographs,
1966, 80 (17, Whole No. 625).

Bray, D. W., & Moses, J. L. Personnel selection. In P. Mussen &
M. Rosenzweig (Eds.), Annual review of psychology. Palo Alto,
California: Annual Reviews Inc., 1972.




Campbell, D. T., & Fiske, D. W. Convergent and discriminant validation
by the multitrait-multimethod matrix. Psychological Bulletin,
1959, 56, 81-105.

Campbell, J. T., Otis, J. L., Liske, R. E., & Prien, E. P. Assessments
of higher level personnel: II. Validity of the over-all assessment
process. Personnel Psychology, 1962, 15, 63-74.

Cronbach, L. J., & Gleser, G. C. Psychological tests and personnel
decisions. Urbana: University of Illinois Press, 1965.

Dawes, R. M. Slitting the decision maker's throat with Occam's Razor:
the superiority of random linear models to real judges. ORI Research
Bulletin, 1972, Vol. 12, No. 13.

Dawes, R. M., & Corrigan, B. Linear models in decision making. 
Psychological Bulletin, 1974, 81, 95-106. 

Donaldson, R. J. Validation of the internal characteristics of an 
assessment center using the multitrait-multimethod approach. 
Unpublished Ph.D. thesis. Case Western Reserve University, 1969. 

Dunnette, M. D. The assessment of managerial talent. In P. McReynolds 
(Ed.), Advances in psychological measurement (Vol. 2). Palo Alto,
California: Science and Behavior Books, 1971. 

Einhorn, H. J. The use of nonlinear compensatory models in decision 
making. Psychological Bulletin, 1970, 73, 221-230.

Einhorn, H. J. Use of nonlinear, noncompensatory models as a function
of task and amount of information. Organizational Behavior and
Human Performance, 1971, 6, 1-27.




Einhorn, H. J. Expert measurement and mechanical combination.
Organizational Behavior and Human Performance, 1972, 7, 86-106.

Erickson, E. H. The nature of clinical evidence. In D. Lerner (Ed.),
Evidence and inference. Glencoe, Ill.: Free Press, 1959, 73-95.


Ferguson, G. A. Statistical analysis in psychology and education. 
Toronto: McGraw Hill, 1971. 


Goldberg, L. R. Diagnosticians versus diagnostic signs: the diagnosis 
of psychosis versus neurosis from the MMPI. Psychological Monographs, 
1965, 79 (9, Whole No. 602). 

Goldberg, L. R. Reliability of Peace Corps selection boards: a study
of interjudge agreement before and after board decisions. Journal
of Applied Psychology, 1966, 50, 400-408.

Goldberg, L. R. Simple models or simple processes? Some research on
clinical judgements. American Psychologist, 1968, 23, 483-496.

Goldberg, L. R. Man vs. model of man: a rationale, plus some evidence
for a method of improving on clinical inferences. Psychological
Bulletin, 1970, 73, 422-432.

Goldberg, L. R. Five models of clinical judgement: an empirical
comparison between linear and nonlinear representations of the human
inference process. Organizational Behavior and Human Performance,
1971, 6, 458-479.

Goldberg, L. R., & Werts, C. E. The reliability of clinicians'
judgements: a multitrait-multimethod approach. Journal of Consulting
Psychology, 1966, 30, 199-206.




Gough, H. G. Clinical versus statistical prediction in psychology.
In L. Postman (Ed.), Psychology in the making. New York: Knopf, 1962.

Grant, D. L., & Bray, D. W. Contributions of the interview to the
assessment of management potential. Journal of Applied Psychology,
1969, 53, 24-34.

Henrichs, J. R. Comparison of "real life" assessments of management
potential with situational exercises, paper-and-pencil ability tests,
and personality inventories. Journal of Applied Psychology, 1969,
53, 425-432.

Hoffman, P. J., Slovic, P., & Rorer, L. G. An analysis of variance
model for the assessment of configural cue utilization in clinical
judgement. Psychological Bulletin, 1968, 69, 338-349.

Hollman, T. D. Employment interviewers' errors in processing positive
and negative information. Journal of Applied Psychology, 1972, 56,
130-134.

Holt, R. R. Clinical and statistical prediction: a reformulation and
some new data. Journal of Abnormal and Social Psychology, 1958, 56,
1-12.

Holt, R. R. Yet another look at clinical and statistical prediction:
or, is clinical psychology worthwhile? American Psychologist,
1970, 25, 337-349.

Langdale, J. A., & Weitz, J. Estimating the influence of job information
on interviewer agreement. Journal of Applied Psychology, 1973,




Lipsett, L. Selecting personnel without tests. Personnel Journal,
1964, 51, 648-654.

Little, K. B., & Schneidman, E. S. Congruencies among interpretations
of psychological test and anamnestic data. Psychological Monographs,
1959, 73 (6, Whole No. 476).

Mayfield, E. C. The selection interview - a reevaluation of published
research. Personnel Psychology, 1964, 17, 239-260.

Mayfield, E. C., & Carlson, R. E. Selection interview decisions: first
results from a long-term research project. Personnel Psychology,
1966, 19, 41-53.

McReynolds, P. An introduction to psychological assessment. In P. 
McReynolds (Ed.), Psychological assessment (Vol. 1). Palo Alto, 
California: Science and Behavior Books, 1968.

Meehl, P. Clinical versus statistical prediction. Minneapolis: 
University of Minnesota Press, 1954. 

Michel, J. O. Assessment center validity: a longitudinal study.
Journal of Applied Psychology, 1975, 60, 573-579.

Moses, J. L. The development of an assessment center for the early
identification of supervisory potential. Personnel Psychology,
1973, 26, 569-580.

Oskamp, S. Overconfidence in case-study judgements. Journal of
Consulting Psychology, 1965, 29, 261-265.

Perez, F. I. An experimental analysis of clinical judgement.
Dissertation Abstracts International, 1973, 73-15532.




Sawyer, J. Measurement and prediction, clinical and statistical. 
Psychological Bulletin, 1966, 66, 178-200. 

Schwab, D. P., & Heneman, H. G. Relationship between interview structure
and interinterviewer reliability in an employment situation.
Journal of Applied Psychology, 1969, 53, 214-217.

Shinedling, M. M., Howell, R. J., & Carlson, G. Another perspective
on clinical judgement. Psychological Reports, 1975, 36, 383-389.

Slovic, P. Cue consistency and cue utilization in judgement.
American Journal of Psychology, 1966, 79, 427-434.

Slovic, P., Rorer, L. G., & Hoffman, P. J. Analyzing use of diagnostic
signs. Investigative Radiology, 1971, 6, 18-26.

Snow, R. E. Representative and quasi-representative designs for research
on teaching. Review of Educational Research, 1974, 44, 265-292.

Spitzer, M. E., & McNamara, W. J. A managerial selection study.
Personnel Psychology, 1964, 17, 19-40.

Stricker, G. Actuarial, naive clinical and sophisticated clinical
prediction of pathology from figure drawings. Journal of Consulting
Psychology, 1967, 31, 492-494.

Trankell, A. The psychologist as the instrument of prediction.
Journal of Applied Psychology, 1959, 43, 170-175.

Ulrich, L., & Trumbo, D. The selection interview since 1949.
Psychological Bulletin, 1965, 63, 100-116.

Vaughn, C. L., & Reynolds, W. A. Reliability of personal interview data.
Journal of Applied Psychology, 1951, 35, 61-63.




Wainer, H. Estimating coefficients in linear models: it don't make 
no nevermind. Psychological Bulletin, 1976, 83, 213-217. 

Webster, E. C. Decision making in the employment interview. Montreal: 
Eagle Publishing Company, 1964. 

Wiggins, N., & Hoffman, P. J. Three models of clinical judgement. 
Journal of Abnormal Psychology, 1968, 73, 70-77. 

Wilson, J. E., & Tatge, W. A. Assessment centers - further assessment
needed. Personnel Journal, 1973, 52, 172-179.

Winer, B. J. Statistical principles in experimental design. Toronto:
McGraw Hill, 1971.

Wollowick, H. B., & McNamara, W. J. Relationship of the components of
an assessment center to management success. Journal of Applied
Psychology, 1969, 53, 348-352.




APPENDIX 1 


Definition of the 18 Characteristics Used in the Study 




APPENDIX 1
FACTOR DEFINITIONS 

General Intelligence. Basic general capacity to learn and 
understand. 

Readiness to Learn. The individual's willingness to acquire 
new information, explore new ideas, methods, tasks, etc. 

Common Sense. The degree of ability to reach quick, practically- 
effective decisions about uncomplicated situations where sound 
judgement depends primarily on accumulated life and work experience, 
established precedent and procedures, etc. 

Management-Level Planning and Problem-Solving. The individual's 
ability to recognize the full depth and breadth of situations and 
problems and to consider the longer-range, as well as the here-and-now, 
consequences of their change or resolution. 

Oral Communication. The degree of clarity and ease with which 
the individual expresses himself in face-to-face discussion. 

General Energy Level. The level of physical vigor and 
vitality the individual will demonstrate in his day-to-day conduct. 

Self-Starting Work Drive. The degree to which the individual 
characteristically keeps himself continuously occupied in work- 
related activities without need of stimulation from his supervisor. 

Efficiency of Application. The economic and productive 
organization and application of work time and effort. 

General Interpersonal Effectiveness. The level of effectiveness
the individual demonstrates in day-to-day dealings with others
with regard to gaining and maintaining their respect for his ideas
and opinions, their confidence in his integrity, and their general
feelings of good will.

Self-Confidence. The degree of basic security the individual
feels in his own ability to deal adequately with most situations
and people he encounters.

Leadership Force. The amount of influence and dominance the 
individual habitually exerts over groups and persons he encounters. 

Supervisory Effectiveness. The individual's habitual effec- 
tiveness in directing, co-ordinating and controlling subordinates 
in standard work settings. 

Self-Reliance. The degree to which the individual carries
out assigned responsibilities without seeking direction, help,
encouragement and/or reassurance from co-workers.

Autonomy. The degree of the individual's need to make his
own decisions, regulate his own behavior, be his own boss, etc.

Adaptability. The level of ability to cope comfortably with
new and changing circumstances.

Responsibility. The degree to which the individual lives up 
to personal, professional and business obligations he has tacitly 
or otherwise accepted. 

Potential for Growth. The degree of probability that the
individual will develop the personal resources to cope with increasingly
more complex and responsible work roles.


General Suitability for Job Concerned. Self-explanatory. 




APPENDIX 2 


Interview Rating Form 




CANDIDATE'S NAME 


CANDIDATE'S AGE 
ASSIGNMENT NUMBER NAME 
DATE 


RATER'S NAME 


Rate the candidate on each of the following characteristics according to 
the following code. Place the number that represents the most correct 
description in the space provided opposite each characteristic. 

1 = Poor; 2 = Marginal; 3 = Adequate; 4 = Good; 5 = Very Good 

If you are genuinely unable to rate a candidate on a characteristic,
leave the space opposite that characteristic blank.

1. General Intelligence    ___    10. Readiness to Learn           ___
2. Common Sense            ___    11. Management Level Planning    ___
3. Oral Communication      ___    12. General Energy Level         ___
4. Work Drive              ___    13. Efficiency of Application    ___
5. Interpersonal/Effect.   ___    14. Self-Confidence              ___
6. Leadership Force        ___    15. Supervisory Effectiveness    ___
7. Self-Reliance           ___    16. Autonomy                     ___
8. Adaptability            ___    17. Responsibility               ___
9. Potential for Growth    ___    18. General Suitability          ___


INTERVIEW RATING FORM




APPENDIX 3 


Interview + Test Rating Form 


CANDIDATE'S NAME 


CANDIDATE'S AGE 
ASSIGNMENT NUMBER NAME 


DATE 


RATER'S NAME 

Rate the candidate on each of the following characteristics according to 
the following code. Place the number that represents the most correct 
description in the space provided opposite each characteristic. 

1 = Poor; 2 = Marginal; 3 = Adequate; 4 = Good; 5 = Very Good 

If you are genuinely unable to rate a candidate on a characteristic,
leave the space opposite that characteristic blank.

1. General Intelligence    ___    10. Readiness to Learn           ___
2. Common Sense            ___    11. Management Level Planning    ___
3. Oral Communication      ___    12. General Energy Level         ___
4. Work Drive              ___    13. Efficiency of Application    ___
5. Interpersonal/Effect.   ___    14. Self-Confidence              ___
6. Leadership Force        ___    15. Supervisory Effectiveness    ___
7. Self-Reliance           ___    16. Autonomy                     ___
8. Adaptability            ___    17. Responsibility               ___
9. Potential for Growth    ___    18. General Suitability          ___


INTERVIEW + TEST RATING FORM




APPENDIX 4 


Test Rating Form 




CANDIDATE'S NAME 


CANDIDATE'S AGE 
ASSIGNMENT NUMBER NAME 


DATE 


RATER'S NAME 


Rate the candidate on each of the following characteristics according to 
the following code. Place the number that represents the most correct 
description in the space provided opposite each characteristic. 

1 = Poor; 2 = Marginal; 3 = Adequate; 4 = Good; 5 = Very Good 

If you are genuinely unable to rate a candidate on a characteristic,
leave the space opposite that characteristic blank.

1. General Intelligence    ___    10. Readiness to Learn           ___
2. Common Sense            ___    11. Management Level Planning    ___
3. Oral Communication      ___    12. General Energy Level         ___
4. Work Drive              ___    13. Efficiency of Application    ___
5. Interpersonal/Effect.   ___    14. Self-Confidence              ___
6. Leadership Force        ___    15. Supervisory Effectiveness    ___
7. Self-Reliance           ___    16. Autonomy                     ___
8. Adaptability            ___    17. Responsibility               ___
9. Potential for Growth    ___    18. General Suitability          ___


TEST RATING FORM




APPENDIX 5 


Dunnette (1971) Table 1 




Table 1
Assessment Methods Showing High Correlations with Each of Eight
Behavior Rating Factors and Overall Staff Prediction for College
and Non-College Men in the AT&T Management Progress Study

Assessment method                               College men    Non-college men

Factor I. General Effectiveness
  Performance in Cooperative Group Exercise      .60
  Performance in Competitive Group Exercise      .67
  Performance on In-Basket                       .60            .59
  Interview:  Personal Impact                   [illegible]     .48
  Projective: Leadership Role                    .48            .51
  Personality Test: Dominance                    .33           [illegible]

Factor II. Administrative Skills
  Performance on In-Basket                       .16            .68
  Performance in Competitive Group Exercise      .48           [illegible]
  Mental Ability Test                            .34            .72
  Interview:  Personal Impact                    .42            .24
              Oral Communications Skills         .33            .53
  Projective: Leadership Role                    .36            .36
  Personality Test: Dominance                    .30            .30

Factor III. Interpersonal Skills
  Performance in Cooperative Group Exercise      .39           [illegible]
  Performance in Competitive Group Exercise      .62            .45
  Performance on In-Basket                       .45            .49
  Interview:  Personal Impact                    .44            .25
              Human Relations Skills             .28            .46

Factor IV. Control of Feelings
  Performance in Competitive Group Exercise      .47            .36
  Performance in Cooperative Group Exercise      .37            .35
  Interview:  Human Relations Skills             .23            .45
              Tolerance of Uncertainty           .30            .40
  Projective: Leadership Role                    .29            .46
              Dependence                        -.28           -.42

Factor V. Intellectual Ability
  Mental Ability Test                            .10            .62
  Interview:  Oral Communications Skills         .40            .47

Factor VI. Work Orientation Motivation
  Projective: Work or Career Orientation         .50            .56
  Interview:  Personal Impact                    .36            .50
              Inner Work Standards               .40            .43
  Performance in Cooperative Exercise            .30            .39
  Performance in Competitive Exercise            .45            .36
  Performance on In-Basket                       .44            .26

Factor VII. Passivity
  Interview:  Need Advancement                  [illegible]    -.67
              Personal Impact                   -.38           -.58
              Need Security                      .50            .37
  Projective: Leadership Role                   -.47           -.40
              Achievement Motivation            -.44           -.50
  Performance in Competitive Exercise           [illegible]    -.36
  Performance in Cooperative Exercise           -.35           -.34
  Personality Test: General Activity            -.43

Factor VIII. Dependency
  Projective: Affiliation                        .46            .41
              Dependence                         .49           [illegible]

Overall Staff Prediction
  Performance in Competitive Exercise            .60            .38
  Performance on In-Basket                       .55            .51
  Performance in Cooperative Exercise            .41            .42
  Interview:  Personal Impact                    .49           [illegible]
              Oral Communications Skills         .41            .48
  Projective: Achievement Motivation             .30            .40




APPENDIX 6 


Principal Components Factor Analysis (Varimax Rotation)
of Characteristics Appraised in Test Rating Condition




APPENDIX 6a
Principal Components Factor Analysis (Varimax Rotation)
of 18 Characteristics Appraised in Test Rating Condition:
Clinician #1 (N=74)

Appraised
Characteristic       Factor 1     Factor 2     Factor 3     Factor 4     Factor 5     h²

1                     .83         -.09         -.05          .07          .09          .70
2                     .22          .16         [illegible]   .05         [illegible]   .46
3                     .38          .77         [illegible]   .18          .06          .79
4                    [illegible]   .10         [illegible]   .83         [illegible]  [illegible]
5                    [illegible]  [illegible]   .26         -.08          .37          .74
6                    [illegible]   .76         [illegible]   .09          .26         [illegible]
7                    -.04          .20          .38          .19          .62          .60
8                     .77          .11         [illegible]  -.20         [illegible]   .68
9                     .84          .29          .18          .10          .05          .83
10                    .82          .13          .20         [illegible]  -.20          .78
11                    .86          .06         [illegible]  [illegible]  -.003        [illegible]
12                    .12          .05          .39          .76          .17          .77
13                    .02         [illegible]   .85          .06          .07          .74
14                    .05          .80         [illegible]   .24          .12         [illegible]
15                    .12          .72          .32         [illegible]   .01          .75
16                    .20          .23         [illegible]   .03          .73          .62
17                    .10          .20          .84          .12         [illegible]   .78
18                    .47          .41         [illegible]   .001         .12          .80

Variance             3.96         3.32         2.45         1.61         1.56         12.91
% of Total Variance   22%          18%          14%           9%           9%          72%
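
The last two rows of the table above are related by simple arithmetic. If each of the 18 standardized characteristics is taken to contribute one unit of variance, total variance is 18, and each factor's percentage is its variance divided by 18. The short Python sketch below reproduces that calculation for Clinician #1. The function name `percent_of_total` and the rounding to whole percentages are illustrative assumptions and are not part of the original analysis.

```python
# Variance accounted for by each rotated factor (Appendix 6a, Clinician #1).
factor_variance = [3.96, 3.32, 2.45, 1.61, 1.56]

# Assumption: the analysis used standardized ratings, so each of the
# 18 characteristics contributes one unit to the total variance.
n_characteristics = 18


def percent_of_total(variances, n_vars):
    """Return each factor's share of total variance as a whole percentage."""
    return [round(100 * v / n_vars) for v in variances]


percents = percent_of_total(factor_variance, n_characteristics)
total_percent = round(100 * sum(factor_variance) / n_characteristics)

print(percents)
print(total_percent)
```

Under this assumption the five factors together account for a little under three quarters of the variance in the 18 ratings, which agrees with the tabled total.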




APPENDIX 6b
Principal Components Factor Analysis (Varimax Rotation)
of 18 Characteristics Appraised in Test Rating Condition:
Clinician #2 (N=7[illegible])

Appraised
Characteristic       Factor 1     Factor 2     Factor 3     Factor 4     Factor 5     h²

1                     .76          .02         -.03          .14          .32          .70
2                     .25          .22         -.23          .68          .08          .64
3                     .82         -.02          .04         [illegible]  -.07          .72
4                    -.002         .05          .14          .04          .83          .71
5                    [illegible]  -.09          .25          .71         [illegible]   .60
6                    -.27          .46          .38          .36         -.12         [illegible]
7                     .04         [illegible]   .80         -.06         -.005        [illegible]
8                     .66          .20         [illegible]   .05         -.37          .67
9                    [illegible]   .45          .13          .14         -.13          .76
10                   [illegible]  [illegible]  -.08         -.05         -.29          .76
11                    .85         -.03         [illegible]   .08          .05          .73
12                   [illegible]   .68         [illegible]   .07          .09         [illegible]
13                   -.20          .56         -.20         -.13          .53          .69
14                   -.008        -.04         [illegible]  [illegible]   .04          .61
15                    .05          .36         [illegible]   .70         -.08          .67
16                   [illegible]   .009         .79          .07          .05          .65
17                    .25          .76         -.10          .18         -.03          .69
18                    .41         [illegible]   .30         [illegible]   .09         [illegible]

Variance             [illegible]  [illegible]  [illegible]  [illegible]  1.36         [illegible]
% of Total Variance  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]  [illegible]



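APPENDIX 6c
Principal Components Factor Analysis (Varimax Rotation)
of 17 Characteristics Appraised in Test Rating Condition:
Clinician #3 (N=75)

Appraised
Characteristic       Factor 1     Factor 2     Factor 3     Factor 4     Factor 5     h²

1                    -.07         [illegible]  [illegible]   .66          .14          .53
2                    -.10         [illegible]  -.04          .69         -.21          .54
[illegible]          [illegible]  [illegible]   .10         -.10         -.68         [illegible]
[illegible]           .48         [illegible]   .30         [illegible]  -.45          .62
6                     .18         [illegible]  [illegible]  -.08         -.26          .76
7                    [illegible]  [illegible]  [illegible]  [illegible]   .10          .63
8                     .06         [illegible]  -.09         -.05          .10          .61
9                    -.04         [illegible]   .31         [illegible]  -.18         [illegible]
10                    .05         [illegible]  [illegible]  [illegible]  -.08         [illegible]
11                    .14         [illegible]   .02          .34         [illegible]  [illegible]
12                    .66         [illegible]   .23         [illegible]   .04         [illegible]
13                   [illegible]  [illegible]  [illegible]   .32          .24         [illegible]
14                    .66         [illegible]  [illegible]  -.15         -.002        [illegible]
15                    .57         [illegible]  -.14          .41          .37          .65
16                    .20         [illegible]  [illegible]  [illegible]   .61          .55
17                   -.04         [illegible]  [illegible]  [illegible]  -.26         [illegible]
18                    .48         [illegible]  [illegible]   .31         [illegible]   .66

Variance             2.44         [illegible]  2.24         [illegible]  [illegible]  [illegible]
% of Total Variance  [illegible]  [illegible]   13%           9%           9%          60%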




APPENDIX 7 


CAPSULE SUMMARY OF TESTS 



Listed below is a short description of each of the tests used
in this study. For complete information regarding a specific test, the
reader is referred to the appropriate test manual.

Differential Aptitude Tests

Verbal Reasoning. This is a verbal concept understanding test. 
It is designed to evaluate the ability to abstract, generalize, and 
to think constructively. Testing format involves verbal analogies. 

Abstract Reasoning. This is a non-verbal reasoning ability test. 
The testee is required to formulate operating principles in changing 
abstract diagrams. Operating principles involve the use of logic. 

Wonderlic Personnel Test

The Wonderlic is a test of mental ability. It is widely used
as a selection tool in hiring and as an indicator of future
development possibility.


Watson-Glaser Critical Thinking Appraisal

This test involves the appraisal of important critical thinking
skills (inference, recognition of assumptions, deduction, interpretation
and evaluation of arguments) in everyday situations.

Business Judgement Test

This test is designed to measure empathy and knowledge of generally
accepted ways of behaving in business interpersonal situations.

Test of Practical Judgement

This test is designed to evaluate the testee's ability to select
the best solution to factual and complex interpersonal business
problems.




Supervisory Practices Test 


This test is designed to appraise supervisory ability or potential 
ability. It is directly concerned with supervisory thinking, attitudes, 
and opinions. 

Management Aptitude Inventory 

This inventory is designed to assess characteristics related
to success in managerial positions (intelligent job performance,
leadership qualities, proper job attitude, and relations with others).

Vocational Preference Inventory

This is a personality questionnaire which uses preference for
vocational titles as a measure of personality style. It is designed to
assess areas such as interpersonal relations, interests, values,
self-conception, coping behavior, and identification.

Edwards Personal Preference Schedule

This personality test provides a convenient measure of normal
personality variables such as achievement, deference, order, exhibition,
autonomy, affiliation, intraception, succorance, dominance, abasement,
nurturance, change, endurance, heterosexuality, and aggression.

California Psychological Inventory

This is a multiple choice personality test which measures 18
personality variables in four general areas (measures of poise,
ascendancy, and self assurance; measures of socialization, maturity
and responsibility; measures of intellectual potential; measures of
personal orientation and values).




